
CN118428858B - Warehouse management method, device, equipment and medium based on large language model - Google Patents


Info

Publication number
CN118428858B
CN118428858B
Authority
CN
China
Prior art keywords
code
language model
target
large language
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410817209.0A
Other languages
Chinese (zh)
Other versions
CN118428858A (en)
Inventor
马仲能
吴庆耀
钟远光
伍聪烜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202410817209.0A priority Critical patent/CN118428858B/en
Publication of CN118428858A publication Critical patent/CN118428858A/en
Application granted granted Critical
Publication of CN118428858B publication Critical patent/CN118428858B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q 10/087 Inventory or stock management, e.g. order filling, procurement or balancing against orders
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/23 Updating
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/242 Query formulation
    • G06F 16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/40 Processing or translation of natural language
    • G06F 40/42 Data-driven translation
    • G06F 40/49 Data-driven translation using very large corpora, e.g. the web
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/02 Knowledge representation; Symbolic representation
    • G06N 5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a warehouse management method, device, equipment and medium based on a large language model, relating to the field of large models and comprising the following steps: creating a warehouse database based on preset warehouse information and preset cargo information, the initial state of the warehouse database being empty; constructing an initial large language model and training it on a preset corpus to obtain a target large language model; and generating operation code for the warehouse database based on a preset code generation method and connecting the target large language model to the warehouse database, so that warehouse management is performed on the warehouse database using the operation code based on the target large language model. By constructing a large language model and connecting it to the database, no dedicated operation interface is needed: relevant personnel communicate with the system in natural language, and the warehousing system automatically resolves the users' requirements, improving both warehouse management efficiency and user experience.

Description

A warehouse management method, device, equipment and medium based on a large language model

Technical Field

The present invention relates to the field of large models, and in particular to a warehouse management method, device, equipment and medium based on a large language model.

Background Art

At present, most solutions for warehouse logistics management systems in the industry still rely on the demand side (warehouse managers) proposing requirements and software developers adding functions to the system to meet them. However, this approach is not scalable for warehouse management: functions are limited to what is manually configured, and the system cannot be integrated with other systems (such as finance or customer service).

The AI (Artificial Intelligence)-based warehousing and logistics management methods that have emerged in recent years involve only large models for processing logistics data; they cannot process human natural language and therefore cannot support human-system interaction. How to improve the user's interaction experience during warehouse management is thus a problem to be solved in this field.

Summary of the Invention

In view of this, the purpose of the present invention is to provide a warehouse management method, device, equipment and storage medium based on a large language model. By constructing a large language model and connecting it to a database, no dedicated operation interface is needed: relevant personnel communicate with the system in natural language, and the warehousing system automatically resolves the users' requirements, improving both warehouse management efficiency and user experience. The specific scheme is as follows:

In a first aspect, the present application provides a warehouse management method based on a large language model, comprising:

creating a warehouse database based on preset warehouse information and preset cargo information, the initial state of the warehouse database being empty;

constructing an initial large language model, and training the initial large language model on a preset corpus to obtain a target large language model;

generating operation code for the warehouse database based on a preset code generation method, and connecting the target large language model to the warehouse database, so as to perform warehouse management on the warehouse database using the operation code based on the target large language model.

Optionally, the operation code is code constructed based on the Structured Query Language.

Optionally, generating the operation code for the warehouse database based on a preset code generation method includes:

constructing an initial code generation model based on the target large language model;

obtaining target code pairs input by a user, and training the initial code generation model on the target code pairs, so that the resulting target code model generates the corresponding operation code from user instructions;

wherein each target code pair includes a code input by the user and the corresponding action.

Optionally, constructing the initial code generation model based on the target large language model includes:

adding a code encoder, a code decoder and a dimensionality-reduction layer to the hidden layers of the target large language model;

correspondingly, obtaining the target code pairs input by the user and training the initial code generation model on the target code pairs includes:

inputting the action of a target code pair into the target large language model to obtain the feature corresponding to the action, and passing the feature through the dimensionality-reduction layer to obtain a reduced-dimension feature;

inputting the code of the target code pair into the code encoder to obtain the encoding corresponding to the code;

determining a first loss corresponding to the target code based on the reduced-dimension feature and the encoding;

inputting the encoding into the code decoder to obtain the corresponding logit value, and determining a second loss between the logit value and the code based on a mean-square-error function;

determining a total loss from the first loss and the second loss based on a preset hyperparameter, and training the initial code generation model based on the total loss to obtain the target code model.

Optionally, after obtaining the target code pairs input by the user, the method further includes:

performing data augmentation on the target code pairs, so as to train the initial code generation model using the augmented target code pairs.

Optionally, performing warehouse management on the warehouse database using the operation code based on the target large language model includes:

determining the feature corresponding to the user instruction, and judging, via a preset binary classification unit in the target large language model, whether the user instruction is an access instruction for the warehouse database based on that feature, so as to perform warehouse management on the warehouse database using the operation code according to the judgment result.

Optionally, performing warehouse management on the warehouse database according to the judgment result includes:

if the user instruction is an access instruction for the warehouse database, using the dimensionality-reduction layer to determine the reduced-dimension feature corresponding to the user instruction;

using the code decoder to obtain the structured query code for the reduced-dimension feature corresponding to the user instruction, and accessing the warehouse database based on the structured query code to obtain an access result;

inputting the access result into the target large language model to obtain the corresponding post-access feature, updating the feature corresponding to the user instruction based on the post-access feature, and inputting the updated feature into the target large language model.

In a second aspect, the present application provides a warehouse management device based on a large language model, comprising:

a database construction module, used to create a warehouse database based on preset warehouse information and preset cargo information, the initial state of the warehouse database being empty;

a model training module, used to construct an initial large language model and train it on a preset corpus to obtain a target large language model;

a warehouse management module, used to generate operation code for the warehouse database based on a preset code generation method and connect the target large language model to the warehouse database, so as to perform warehouse management on the warehouse database using the operation code based on the target large language model.

In a third aspect, the present application provides an electronic device comprising a processor and a memory, wherein the memory stores a computer program that is loaded and executed by the processor to implement the aforementioned warehouse management method based on a large language model.

In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the aforementioned warehouse management method based on a large language model.

In the present application, a warehouse database is first created based on preset warehouse information and preset cargo information, and an initial large language model is constructed and trained on a preset corpus to obtain a target large language model; operation code for the warehouse database is then generated based on a preset code generation method, and the target large language model is connected to the warehouse database, so that warehouse management is performed on the warehouse database using the operation code based on the target large language model. In this way, by constructing a large language model, training it on a massive corpus using large-model training methods until it has basic language understanding and reasoning capabilities, and then connecting it to the database, no dedicated operation interface is needed: relevant personnel communicate with the system in natural language, the warehousing system automatically resolves the users' requirements, the cost of having program developers modify code for each specific function is eliminated, and both warehouse management efficiency and user experience are improved.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required by the embodiments or the description of the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.

FIG. 1 is a flow chart of a warehouse management method based on a large language model provided by the present application;

FIG. 2 is a flow chart of a specific warehouse management method based on a large language model provided by the present application;

FIG. 3 is a schematic diagram of a code contrastive learning method provided by the present application;

FIG. 4 is a schematic diagram of the structure of a warehouse management device based on a large language model provided by the present application;

FIG. 5 is a structural diagram of an electronic device provided by the present application.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

At present, most solutions for warehouse logistics management systems in the industry still rely on the demand side proposing requirements and software developers adding functions to the system to meet them. This approach is not scalable for warehouse management, is limited to manual configuration, and cannot integrate the system with other systems. The present application instead constructs and trains a large language model until it has basic language understanding and reasoning capabilities, then connects it to the database; without a dedicated operation interface, relevant personnel communicate with the system in natural language, and the warehousing system automatically resolves the users' requirements.

As shown in FIG. 1, an embodiment of the present invention discloses a warehouse management method based on a large language model, comprising:

Step S11: Create a warehouse database based on preset warehouse information and preset cargo information; the initial state of the warehouse database is empty.

In this embodiment, a warehouse database for the large language model application must first be constructed from the preset warehouse information and cargo information, or from a general warehouse database template, so that warehouse and cargo information can be created and modified through this database structure. It should be understood that the initial state of the warehouse database is empty; the database is updated when a warehouse is added or the goods in a warehouse are modified, and it can specifically be divided into warehouse information, information on the items in each warehouse, warehouse transaction information, and so on.
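The database layout just described (warehouse information, item information, transaction information, initially empty) can be sketched with SQLite; all table and column names below are illustrative assumptions, not the patent's actual schema.

```python
import sqlite3

# Minimal sketch of the warehouse database described above. The three tables
# mirror the text's split into warehouse info, item info and transaction info;
# every name here is an illustrative assumption.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE warehouse    (id TEXT PRIMARY KEY, location TEXT);
CREATE TABLE inventory    (name TEXT, quantity INTEGER,
                           warehouse TEXT REFERENCES warehouse(id));
CREATE TABLE transactions (ts TEXT, action TEXT,
                           warehouse TEXT REFERENCES warehouse(id));
""")
# The initial state is empty; rows appear only when warehouses or goods
# are created or modified.
cur.execute("SELECT COUNT(*) FROM inventory")
print(cur.fetchone()[0])  # 0
```

Adding a warehouse or modifying its goods would then be ordinary `INSERT`/`UPDATE` statements against these tables.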

Step S12: Construct an initial large language model, and train the initial large language model on a preset corpus to obtain a target large language model.

In this embodiment, an initial large language model for warehouse database management can be constructed and trained on a preconfigured corpus to obtain the target large language model, so that after training on a massive corpus with large-model training methods, it has basic language understanding and reasoning capabilities. It should also be pointed out that in this embodiment, model accuracy can be evaluated on a massive corpus during training; when the evaluation metric reaches the target value, the model is considered to have sufficient understanding and reasoning capabilities, and training stops. Pre-training and fine-tuning the large language model on large-scale corpora such as Wikipedia, news articles and social media allows it to adapt to different application scenarios and datasets and achieve good performance, which helps build a warehouse management system in combination with the warehouse database of this embodiment.
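The stop criterion just described (halt training once the corpus-based evaluation metric reaches its target) can be sketched as follows; the per-epoch scores and the target value are dummy placeholders, not real evaluation results.

```python
# Sketch of the training stop criterion described above: training proceeds
# epoch by epoch until the evaluation metric reaches the target value, at
# which point the model is judged to have sufficient understanding and
# reasoning ability. eval_scores stands in for real corpus evaluation.
def train_until_target(eval_scores, target):
    for epoch, score in enumerate(eval_scores, start=1):
        # ... one epoch of training on the corpus would run here ...
        if score >= target:
            return epoch  # stop: target reached
    return None  # target never reached within the given epochs

print(train_until_target([0.61, 0.72, 0.81, 0.86], target=0.80))  # 3
```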

Step S13: Generate operation code for the warehouse database based on a preset code generation method, and connect the target large language model to the warehouse database, so as to perform warehouse management on the warehouse database using the operation code based on the target large language model.

In this embodiment, the operation code for the warehouse database is constructed based on the preset code generation method and the Structured Query Language, and the target large language model is connected to the warehouse database, so that the model operates the warehouse database through database-accessing operation code to perform warehouse management. It should be understood that if warehouse logistics tasks were solved by the large language model alone, its short-term memory would be unreliable: the amount of data it can remember is limited, and information unused for a long time is "forgotten". Therefore, in this embodiment, data storage is made into a separate module that serves as a knowledge base for the large model: the warehouse data is stored in a separate database and the model is connected to that database, which solves the model's long-range forgetting problem.
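The "database as external knowledge base" idea above can be sketched as follows: warehouse data lives in a separate database, and the operation code the model is assumed to emit is executed against it. The schema, sample rows, and the generated query are all illustrative assumptions.

```python
import sqlite3

# Sketch: a separate warehouse database acts as the model's knowledge base.
# generated_sql stands in for SQL operation code emitted by the model; here
# it is hard-coded for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (name TEXT, warehouse TEXT);
INSERT INTO inventory VALUES ('bolts', 'A'), ('nuts', 'A'), ('screws', 'B');
""")
generated_sql = "SELECT name FROM inventory WHERE warehouse='A'"  # assumed model output
rows = [r[0] for r in conn.execute(generated_sql)]
print(rows)  # ['bolts', 'nuts']
```

Because the data persists in the database rather than in the model's context, nothing is "forgotten" between queries.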

In this embodiment, a warehouse database is first created based on preset warehouse information and preset cargo information, and an initial large language model is constructed and trained on a preset corpus to obtain a target large language model; operation code for the warehouse database is then generated based on a preset code generation method, and the target large language model is connected to the warehouse database so that the database is managed using the operation code based on the target large language model. In this way, the warehouse logistics system is built on a large language model that, after being trained on a massive corpus with large-model training methods until it has basic language understanding and reasoning capabilities, is connected to the database; without a dedicated operation interface, relevant personnel communicate with the system in natural language, the warehousing system automatically resolves the users' requirements, the cost of having program developers modify code for each specific function is eliminated, and both warehouse management efficiency and user experience are improved.

As can be seen from the previous embodiment, the present application can build a warehouse logistics system based on a large language model and connect the model to a database, avoiding the cost of modifying code for specific functions. Next, this embodiment describes in detail the process of warehouse management based on the large language model. As shown in FIG. 2, an embodiment of the present application discloses a specific warehouse management method based on a large language model, comprising:

Step S21: Construct an initial code generation model based on the target large language model, obtain target code pairs input by a user, and train the initial code generation model on the target code pairs, so that the resulting target code model generates the corresponding operation code from user instructions.

In this embodiment, when generating the operation code for the warehouse database, a "few-shot semi-supervised natural language-code contrastive learning" method is proposed, which lets the language model generate database-accessing code to operate the database, so that the large language model accesses the database spontaneously. It should be pointed out that in this embodiment the database is usually accessed with a single SQL (Structured Query Language) statement; for example, to retrieve the names of all goods in warehouse A, the code "SELECT name FROM inventory WHERE warehouse='A'" can be used. It should be understood that although current large models can generate code after training, in terms of precision, code generated by a small model is often more accurate and contains less extraneous information. Therefore, to let the large language model access the database spontaneously, this embodiment proposes the "few-shot semi-supervised natural language-code contrastive learning" method shown in FIG. 3: an initial code generation model is first constructed based on the target large language model, target code pairs input by the user (each comprising a code and the corresponding action) are obtained, and the initial code generation model is trained on the target code pairs so that the resulting target code model generates the corresponding operation code from user instructions.

Specifically, when constructing the initial code generation model based on the target large language model, several hidden layers are first added on top of the target large language model: a code encoder, a code decoder, and a dimensionality-reduction layer for the language feature representation. The action of a target code pair is then input into the target large language model to obtain the feature corresponding to the action, and this feature is passed through the dimensionality-reduction layer to obtain a reduced-dimension feature. In this process, before the target code pairs are input into the large language model, data augmentation may be performed on them so that the augmented pairs are used to train the initial code generation model. For example, given a series of (action, code) pairs such as ("query all goods under warehouse A", "SELECT name FROM inventory WHERE warehouse='A'"), these pairs are augmented, for instance by changing 'A' to 'B', to obtain more samples and improve training. The code of the target code pair is then input into the code encoder to obtain the corresponding encoding; inputting the action into the language model yields the corresponding feature representation, which after dimensionality reduction becomes the reduced-dimension feature. Based on contrastive learning, the first loss corresponding to the target code is determined from the reduced-dimension feature and the encoding; the encoding is then input into the code decoder to obtain the corresponding logit value, and the second loss between the logit value and the actual code is determined using the mean-square-error (MSE) function; finally, the total loss of the first loss and the second loss is determined based on a preset hyperparameter, and the initial code generation model is trained on the total loss to obtain the target code model.
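The augmentation step just described, swapping the warehouse identifier in the (action, code) pairs to obtain more samples, can be sketched as follows; the helper name and warehouse list are illustrative assumptions.

```python
# Sketch of the data augmentation described above: given a few (action, code)
# pairs, swap the warehouse identifier to generate more training samples.
def augment(pairs, warehouses=("A", "B", "C")):
    out = []
    for action, code in pairs:
        for w in warehouses:
            out.append((action.replace("A", w), code.replace("'A'", f"'{w}'")))
    return out

pairs = [("query all goods under warehouse A",
          "SELECT name FROM inventory WHERE warehouse='A'")]
augmented = augment(pairs)
print(len(augmented))   # 3
print(augmented[1][1])  # SELECT name FROM inventory WHERE warehouse='B'
```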

In this embodiment, the large language model parses the user's input into an intermediate result, the "feature representation", denoted F. In addition, a small "input-encoder-decoder-output" model for generating code is built: the code passes through the encoder to give the code's feature representation E, and the decoder restores the code itself from E. After dimensionality reduction, F yields a feature matrix F′ of the same dimensions as E. The loss function is taken as L = L1 + λ·L2, where L1 is the contrastive loss between F′ and E, L2 is the MSE loss between the logits of the decoder-generated code and the actual code, and L is the training objective. It should be noted that training on this objective updates only the encoder, the decoder, and the dimensionality reduction layer; the large model itself is not trained.
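As a minimal NumPy sketch of this objective — under the assumption that the contrastive term is an InfoNCE-style loss between the reduced action features F′ and the code encodings E, and the second term is the MSE between decoder logits and the actual code tokens — the total loss L = L1 + λ·L2 could be computed as follows. All shapes, the softmax temperature, and λ are illustrative placeholders:

```python
import numpy as np

def contrastive_loss(f, e, tau=0.1):
    """InfoNCE over a batch: matching (f_i, e_i) pairs are the positives."""
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    logits = f @ e.T / tau                       # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # positives on the diagonal

def total_loss(f_reduced, e_code, logits, targets, lam=0.5):
    l1 = contrastive_loss(f_reduced, e_code)     # L1: align F' with E
    l2 = np.mean((logits - targets) ** 2)        # L2: MSE on decoder logits
    return l1 + lam * l2                         # L = L1 + lambda * L2

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 8))
e = f + 0.01 * rng.normal(size=(4, 8))           # nearly matching encodings
loss = total_loss(f, e, rng.normal(size=(4, 16)), rng.normal(size=(4, 16)))
```

In the patent's setup, gradients of this loss would flow only into the encoder, decoder, and dimensionality reduction layer, with the large model's parameters frozen.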

Step S22: determine the features corresponding to the user instruction, and judge, by a preset binary classification unit in the target large language model and based on those features, whether the user instruction is an access instruction for the warehouse database, so as to perform warehouse management on the warehouse database with the operation code according to the judgment result.

In this embodiment, a binary classification head can be attached to the large model, whose prediction indicates whether the operation accesses the database. Concretely, the features corresponding to the user instruction are determined, and the preset binary classification unit in the target large language model judges from those features whether the user instruction is an access instruction for the warehouse database, so that warehouse management is performed on the warehouse database with the operation code according to the judgment result.
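A binary classification head of this kind is, in the simplest case, a linear score plus a sigmoid over the instruction feature. The weights, feature values, and threshold below are illustrative placeholders (in the patent they would be learned), not values from the source:

```python
import math

def classify_db_access(feature, weights, bias, threshold=0.5):
    """Linear score + sigmoid; returns (probability, database-access flag)."""
    score = sum(f * w for f, w in zip(feature, weights)) + bias
    p = 1.0 / (1.0 + math.exp(-score))  # sigmoid probability
    return p, p >= threshold

# toy instruction feature and head parameters
p, is_db = classify_db_access([0.2, -0.1, 0.4], [1.5, 0.5, 2.0], 0.1)
```

When the flag is true, the pipeline proceeds to generate and execute SQL; otherwise the instruction is handled by the language model alone.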

If the user instruction is an access instruction for the warehouse database, the dimensionality reduction layer is used to determine the reduced feature corresponding to the user instruction, the code decoder produces structured query code from that reduced feature, and the warehouse database is accessed with the structured query code to obtain an access result. The access result is then input into the target large language model to obtain the corresponding post-access feature, the feature corresponding to the user instruction is updated with it, and the updated feature is fed back into the target large language model. That is, in this embodiment, when the binary classification head predicts true, F′ is computed and passed into the decoder to obtain the SQL code that accesses the database; the query result is input into the language model to obtain the data's feature representation D, and F is set to F + D, the superimposed new F being used for subsequent actions in the language model.
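The branch above — decode SQL, query the database, fold the result back into the instruction feature — can be illustrated end to end with an in-memory SQLite database. The table layout, the hard-coded SQL string standing in for the decoder output, and the toy feature vectors are all assumptions for illustration:

```python
import sqlite3

# in-memory stand-in for the warehouse database (schema is an assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (name TEXT, warehouse TEXT)")
conn.executemany("INSERT INTO inventory VALUES (?, ?)",
                 [("bolts", "A"), ("nuts", "A"), ("pipes", "B")])

# stand-in for the structured query code emitted by the code decoder
sql = "SELECT name FROM inventory WHERE warehouse='A'"
rows = [r[0] for r in conn.execute(sql)]  # the "access result"

f = [0.0, 0.0, 0.0]                    # instruction feature F (placeholder)
d = [float(len(rows))] * 3             # post-access feature D (placeholder)
f = [fi + di for fi, di in zip(f, d)]  # F <- F + D, fed back to the model
```

In the real system, F and D would be the language model's hidden representations rather than toy vectors, but the control flow is the same.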

In this embodiment, an initial code generation model is constructed based on the target large language model, a target code pair input by the user is obtained, and the initial code generation model is trained on that pair, so that corresponding operation code is generated from user instructions with the resulting target code model. The features corresponding to the user instruction are then determined, and the preset binary classification unit in the target large language model judges from those features whether the user instruction is an access instruction for the warehouse database, so that warehouse management is performed on the warehouse database with the operation code according to the judgment result. In this way, the few-shot self-supervised natural-language-to-code contrastive learning method proposed in this embodiment lets the language model generate SQL code to access the database on its own while preserving the structural integrity of the database, and the warehousing and logistics system is built on the large language model. By applying a system with the large model as its base to warehousing and logistics management, the basic warehousing and logistics functions are fine-tuned on top of a generalized large model pre-trained on a massive corpus, improving its accuracy in answering questions and processing transactions. Compared with traditional warehousing system development, no complicated code writing, debugging, and testing is needed; the required new function only has to be "told" to the model and the model implements it, eliminating the cost of having developers modify code for each specific function.

As shown in FIG. 4, an embodiment of the present application further discloses a warehouse management apparatus based on a large language model, including:

A database construction module 11, configured to create a warehouse database based on preset warehouse information and preset goods information, the initial state of the warehouse database being empty;

A model training module 12, configured to construct an initial large language model and train the initial large language model on a preset corpus to obtain a target large language model;

A warehouse management module 13, configured to generate operation code for the warehouse database based on a preset code generation method and connect the target large language model to the warehouse database, so as to perform warehouse management on the warehouse database with the operation code based on the target large language model.

In this embodiment, a warehouse database is first created based on preset warehouse information and preset goods information, an initial large language model is constructed and trained on a preset corpus to obtain a target large language model, operation code for the warehouse database is then generated based on a preset code generation method, and the target large language model is connected to the warehouse database so that warehouse management is performed on the warehouse database with the operation code based on the target large language model. With this technical solution, a large language model can be built and trained on a massive corpus using large-model training methods; once it has basic language understanding and reasoning capability, it is connected to the database. No dedicated operation interface is needed: staff communicate with the system in natural language, and the warehousing system automatically handles the user's needs, eliminating the cost of having developers modify code for each specific function and improving both the efficiency of warehouse management and the user experience.

In some specific embodiments, the warehouse management module 13 specifically includes:

A model generation submodule, configured to construct an initial code generation model based on the target large language model;

A model training submodule, configured to obtain a target code pair input by a user and train the initial code generation model on it, so that corresponding operation code is generated from user instructions with the resulting target code model; wherein the target code pair includes the code input by the user and the corresponding action.

In some specific embodiments, the model generation submodule specifically includes:

A model building unit, configured to add a code encoder, a code decoder, and a dimensionality reduction layer to the hidden layers of the target large language model.

Correspondingly, the model training submodule specifically includes:

A feature dimensionality reduction unit, configured to input the action of the target code pair into the target large language model to obtain the features corresponding to the action, and to reduce those features through the dimensionality reduction layer to obtain reduced features;

A code encoding unit, configured to input the code of the target code pair into the code encoder to obtain the encoding corresponding to the code;

A first loss determination unit, configured to determine a first loss corresponding to the target code based on the reduced features and the encoding;

A second loss determination unit, configured to input the encoding into the code decoder to obtain corresponding logit values, and to determine a second loss between the logit values and the code based on a mean-square-error function;

A total loss determination unit, configured to determine a total loss of the first loss and the second loss based on a preset hyperparameter, and to train the initial code generation model on the total loss to obtain the target code model.

In some specific embodiments, the warehouse management module 13 further includes:

A data augmentation unit, configured to perform data augmentation on the target code pair so as to train the initial code generation model with the augmented target code pair.

In some specific embodiments, the warehouse management module 13 specifically includes:

An instruction judgment unit, configured to determine the features corresponding to the user instruction, and to judge, by a preset binary classification unit in the target large language model and based on those features, whether the user instruction is an access instruction for the warehouse database, so as to perform warehouse management on the warehouse database with the operation code according to the judgment result.

In some specific embodiments, the warehouse management module 13 specifically includes:

A feature determination unit, configured to determine, with the dimensionality reduction layer, the reduced features corresponding to the user instruction if the user instruction is the access instruction for the warehouse database;

A database access unit, configured to obtain, with the code decoder, structured query code from the reduced features corresponding to the user instruction, and to access the warehouse database based on the structured query code to obtain an access result;

A feature updating unit, configured to input the access result into the target large language model to obtain corresponding post-access features, to update the features corresponding to the user instruction based on the post-access features, and to input the updated features into the target large language model.

Furthermore, an embodiment of the present application also discloses an electronic device. FIG. 5 is a structural diagram of an electronic device 20 according to an exemplary embodiment; nothing in the figure should be regarded as limiting the scope of application of the present application.

FIG. 5 is a schematic structural diagram of an electronic device 20 provided in an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 stores a computer program that is loaded and executed by the processor 21 to implement the relevant steps of the large-language-model-based warehouse management method disclosed in any of the preceding embodiments. The electronic device 20 in this embodiment may specifically be an electronic computer.

In this embodiment, the power supply 23 provides the working voltage for the hardware devices on the electronic device 20; the communication interface 24 creates a data transmission channel between the electronic device 20 and external devices, following any communication protocol applicable to the technical solution of the present application, which is not specifically limited here; the input/output interface 25 obtains external input data or outputs data to the outside, and its specific interface type may be selected according to the application's needs and is likewise not specifically limited here.

In addition, the memory 22, as a carrier for storing resources, may be a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like; the resources stored on it may include an operating system 221 and a computer program 222, and the storage may be temporary or permanent.

The operating system 221 manages and controls the hardware devices and the computer program 222 on the electronic device 20 and may be Windows Server, Netware, Unix, Linux, etc. The computer program 222 may, in addition to the computer program for performing the large-language-model-based warehouse management method executed by the electronic device 20 disclosed in any of the preceding embodiments, further include computer programs for completing other specific tasks.

Furthermore, the present application also discloses a computer-readable storage medium for storing a computer program; when the computer program is executed by a processor, the large-language-model-based warehouse management method disclosed above is implemented. For the specific steps of the method, reference may be made to the corresponding content disclosed in the preceding embodiments, which is not repeated here.

The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be consulted against one another. Since the apparatus disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief; for the relevant details, see the description of the method.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate this interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present application.

The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Finally, it should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Absent further limitation, an element qualified by "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes it.

The technical solution provided by the present application has been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present application; the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art will, in accordance with the idea of the present application, make changes in the specific implementation and scope of application. In summary, the content of this specification should not be understood as limiting the present application.

Claims (8)

1. A warehouse management method based on a large language model, characterized by comprising:
creating a warehouse database based on preset warehouse information and preset goods information, the initial state of the warehouse database being empty;
constructing an initial large language model, and training the initial large language model on a preset corpus to obtain a target large language model;
generating operation code for the warehouse database based on a preset code generation method, and connecting the target large language model to the warehouse database, so as to perform warehouse management on the warehouse database with the operation code based on the target large language model;
wherein generating the operation code for the warehouse database based on the preset code generation method comprises:
constructing an initial code generation model based on the target large language model;
obtaining a target code pair input by a user, and training the initial code generation model on the target code pair, so as to generate corresponding operation code from user instructions with the resulting target code model;
wherein the target code pair comprises the code input by the user and the corresponding action;
and wherein constructing the initial code generation model based on the target large language model comprises:
adding a code encoder, a code decoder, and a dimensionality reduction layer to the hidden layers of the target large language model;
and correspondingly, obtaining the target code pair input by the user and training the initial code generation model on the target code pair comprises:
inputting the action of the target code pair into the target large language model to obtain features corresponding to the action, and reducing the features through the dimensionality reduction layer to obtain reduced features;
inputting the code of the target code pair into the code encoder to obtain the encoding corresponding to the code;
determining a first loss corresponding to the target code based on the reduced features and the encoding;
inputting the encoding into the code decoder to obtain corresponding logit values, and determining a second loss between the logit values and the code based on a mean-square-error function;
determining a total loss of the first loss and the second loss based on a preset hyperparameter, and training the initial code generation model on the total loss to obtain the target code model.

2. The warehouse management method based on a large language model according to claim 1, characterized in that the operation code is code constructed in a structured query language.

3. The warehouse management method based on a large language model according to claim 1, characterized by further comprising, after obtaining the target code pair input by the user:
performing data augmentation on the target code pair, so as to train the initial code generation model with the augmented target code pair.

4. The warehouse management method based on a large language model according to claim 1, characterized in that performing warehouse management on the warehouse database with the operation code based on the target large language model comprises:
determining features corresponding to the user instruction, and judging, by a preset binary classification unit in the target large language model and based on the features corresponding to the user instruction, whether the user instruction is an access instruction for the warehouse database, so as to perform warehouse management on the warehouse database with the operation code according to the judgment result.

5. The warehouse management method based on a large language model according to claim 4, characterized in that performing warehouse management on the warehouse database according to the judgment result comprises:
if the user instruction is the access instruction for the warehouse database, determining reduced features corresponding to the user instruction with the dimensionality reduction layer;
obtaining, with the code decoder, structured query code from the reduced features corresponding to the user instruction, and accessing the warehouse database based on the structured query code to obtain an access result;
inputting the access result into the target large language model to obtain corresponding post-access features, updating the features corresponding to the user instruction based on the post-access features, and inputting the updated features into the target large language model.

6. A warehouse management apparatus based on a large language model, characterized by comprising:
a database construction module, configured to create a warehouse database based on preset warehouse information and preset goods information, the initial state of the warehouse database being empty;
a model training module, configured to construct an initial large language model and train the initial large language model on a preset corpus to obtain a target large language model;
a warehouse management module, configured to generate operation code for the warehouse database based on a preset code generation method, and to connect the target large language model to the warehouse database, so as to perform warehouse management on the warehouse database with the operation code based on the target large language model;
wherein the warehouse management module specifically comprises:
a model generation submodule, configured to construct an initial code generation model based on the target large language model;
a model training submodule, configured to obtain a target code pair input by a user and train the initial code generation model on the target code pair, so as to generate corresponding operation code from user instructions with the resulting target code model, wherein the target code pair comprises the code input by the user and the corresponding action;
and wherein the model generation submodule specifically comprises:
a model building unit, configured to add a code encoder, a code decoder, and a dimensionality reduction layer to the hidden layers of the target large language model;
and correspondingly, the model training submodule specifically comprises:
a feature dimensionality reduction unit, configured to input the action of the target code pair into the target large language model to obtain features corresponding to the action, and to reduce the features through the dimensionality reduction layer to obtain reduced features;
a code encoding unit, configured to input the code of the target code pair into the code encoder to obtain the encoding corresponding to the code;
a first loss determination unit, configured to determine a first loss corresponding to the target code based on the reduced features and the encoding;
a second loss determination unit, configured to input the encoding into the code decoder to obtain corresponding logit values, and to determine a second loss between the logit values and the code based on a mean-square-error function;
a total loss determination unit, configured to determine a total loss of the first loss and the second loss based on a preset hyperparameter, and to train the initial code generation model on the total loss to obtain the target code model.

7. An electronic device, characterized in that the electronic device comprises a processor and a memory, wherein the memory stores a computer program that is loaded and executed by the processor to implement the warehouse management method based on a large language model according to any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the warehouse management method based on a large language model according to any one of claims 1 to 5.
Publications (2)

Publication Number Publication Date
CN118428858A CN118428858A (en) 2024-08-02
CN118428858B true CN118428858B (en) 2024-09-20

Family

ID=92323506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410817209.0A Active CN118428858B (en) 2024-06-24 2024-06-24 Warehouse management method, device, equipment and medium based on large language model

Country Status (1)

Country Link
CN (1) CN118428858B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118628025B (en) * 2024-08-15 2024-11-05 辽宁金晟科技股份有限公司 Storage goods digital high-efficiency storage method based on NLP technology

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114691718A (en) * 2022-03-29 2022-07-01 中国工商银行股份有限公司 Query statement generation method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115658729A (en) * 2022-11-02 2023-01-31 广东工业大学 A method of converting natural language to SQL statements based on pre-trained models
CN115587157A (en) * 2022-11-03 2023-01-10 科大讯飞股份有限公司 SQL statement generation method and device, electronic equipment and storage medium
CN117573822A (en) * 2023-11-15 2024-02-20 Oppo广东移动通信有限公司 Human-computer interaction methods, devices, equipment and storage media
CN117708339B (en) * 2024-02-05 2024-04-23 中南大学 ICD automatic coding method based on pre-training language model

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114691718A (en) * 2022-03-29 2022-07-01 中国工商银行股份有限公司 Query statement generation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Warehouse management system for frozen and refrigerated food logistics based on radio frequency identification technology; Liu He; Food and Machinery; 2016-01-31; Vol. 32, No. 1; pp. 121-124 *

Also Published As

Publication number Publication date
CN118428858A (en) 2024-08-02

Similar Documents

Publication Publication Date Title
RU2408074C2 (en) Method, system and apparatus for providing access to workbook models through remote function calls
KR20210106398A (en) Conversation-based recommending method, conversation-based recommending apparatus, and device
CN114090755B (en) Reply sentence determining method and device based on knowledge graph and electronic equipment
CN112256886B (en) Probability calculation method and device in atlas, computer equipment and storage medium
US10552426B2 (en) Adaptive conversational disambiguation system
CN118227655A (en) Database query statement generation method, device, equipment and storage medium
CN118428858B (en) Warehouse management method, device, equipment and medium based on large language model
US11688393B2 (en) Machine learning to propose actions in response to natural language questions
CN113569017A (en) Model processing method and device, electronic equipment and storage medium
WO2025106226A1 (en) Context-based prompt generation for automated translations between natural language and query language
JP2023002475A (en) Computer system, computer program and computer-implemented method (causal knowledge identification and extraction)
CN118627546A (en) Methods and equipment for model training and data processing
US20230070966A1 (en) Method for processing question, electronic device and storage medium
US20250181622A1 (en) Method and system for dialogue data generation and processing
JP2025081252A (en) Interaction method, interaction apparatus, electronic device, storage medium, and computer program
CN118964557A (en) Question answer generation method, device, electronic device and storage medium
CN113110843B (en) Contract generation model training method, contract generation method and electronic equipment
CN120045646A (en) Text processing method, text processing device, computer equipment, storage medium and program product
CN118569650B (en) A method for constructing a small and micro risk control model, an application method and a system
CN118446654B (en) Process approval method and system based on three-dimensional data items
CN120030345A (en) Training data generation method and related equipment
CN118673135A (en) Output control method and device for target object description information
Soffer et al. Reusability of conceptual models: The problem of model variations
HK40056469A (en) Model processing method and apparatus, electronic device and storage medium
CN120470097A (en) Question-answering method, training method, device, agent, equipment and medium based on large model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant