
WO2023213270A1 - Model training processing method, apparatus, terminal and network-side device - Google Patents

Model training processing method, apparatus, terminal and network-side device

Info

Publication number
WO2023213270A1
WO2023213270A1 · PCT/CN2023/092028 · CN2023092028W
Authority
WO
WIPO (PCT)
Prior art keywords
data
model
information
tagged
case
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/092028
Other languages
English (en)
French (fr)
Inventor
孙布勒
杨昂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to EP23799256.5A priority Critical patent/EP4521309A4/en
Publication of WO2023213270A1 publication Critical patent/WO2023213270A1/zh
Priority to US18/935,694 priority patent/US20250061381A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04W WIRELESS COMMUNICATION NETWORKS
          • H04W24/00 Supervisory, monitoring or testing arrangements
            • H04W24/02 Arrangements for optimising operational condition
    • G PHYSICS
      • G06 COMPUTING OR CALCULATING; COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
            • G06F16/90 Details of database functions independent of the retrieved data types
              • G06F16/906 Clustering; Classification
          • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
            • G06F17/10 Complex mathematical operations
              • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N20/00 Machine learning
            • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
            • G06N20/20 Ensemble learning
          • G06N3/00 Computing arrangements based on biological models
            • G06N3/02 Neural networks
              • G06N3/04 Architecture, e.g. interconnection topology
                • G06N3/044 Recurrent networks, e.g. Hopfield networks
                • G06N3/045 Combinations of networks
                • G06N3/0464 Convolutional networks [CNN, ConvNet]
                • G06N3/047 Probabilistic or stochastic networks
                • G06N3/048 Activation functions
                • G06N3/0499 Feedforward networks
              • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
                • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
              • G06N3/08 Learning methods
                • G06N3/084 Backpropagation, e.g. using gradient descent
                • G06N3/088 Non-supervised learning, e.g. competitive learning
                • G06N3/09 Supervised learning
                • G06N3/098 Distributed learning, e.g. federated learning
          • G06N7/00 Computing arrangements based on specific mathematical models
            • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • This application belongs to the field of communication technology, and specifically relates to a model training processing method, device, terminal and network side equipment.
  • AI Artificial Intelligence
  • Embodiments of the present application provide a model training processing method, device, terminal and network side equipment, which improves the reliability of AI-based wireless communication.
  • the first aspect provides a model training processing method, including:
  • the first device acquires first information, where the first information includes first data
  • the first device uses the first model to process the first data to obtain the second data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
  • the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • a model training processing method including:
  • the second device sends second information to the first device, the second information includes a first model, the first model is used by the first device to obtain second data based on the first data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
  • the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • a model training processing device including:
  • An acquisition module configured to acquire first information, where the first information includes first data
  • a processing module used to process the first data using the first model to obtain the second data
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
  • the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • a model training processing device including:
  • a second sending module configured to send second information to the first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on the first data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
  • the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • a model training processing device including:
  • a second sending module configured to send second information to the first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on the first data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
  • the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • a model training processing device including:
  • a third sending module configured to send first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on the first model;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
  • the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • In a seventh aspect, a terminal is provided, including a processor and a memory. The memory stores programs or instructions that can be run on the processor. When the programs or instructions are executed by the processor, the steps of the method described in the first aspect are implemented.
  • a terminal including a processor and a communication interface, wherein,
  • the processor is configured to obtain first information, where the first information includes first data, and to process the first data using the first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data;
  • the communication interface is used to send second information to the first device, where the second information includes the first model, and the first model is used by the first device to obtain the second data based on the first data; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • the communication interface is used to send first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain the second data based on the first model; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • In a ninth aspect, a network side device is provided, including a processor and a memory. The memory stores programs or instructions that can be run on the processor. When the programs or instructions are executed by the processor, the steps of the method described in the second aspect are implemented.
  • a network side device including a processor and a communication interface, wherein,
  • the processor is configured to obtain first information, where the first information includes first data, and to process the first data using the first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data;
  • the communication interface is used to send second information to the first device, where the second information includes the first model, and the first model is used by the first device to obtain the second data based on the first data; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • the communication interface is used to send first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain the second data based on the first model; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • In an eleventh aspect, a readable storage medium is provided. Programs or instructions are stored on the readable storage medium. When the programs or instructions are executed by a processor, the steps of the method described in the first aspect or the steps of the method described in the second aspect are implemented.
  • In a twelfth aspect, a chip is provided, including a processor and a communication interface. The communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the steps of the method described in the first aspect or the steps of the method described in the second aspect.
  • A computer program/program product is provided. The computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement the steps of the method described in the first aspect or the steps of the method described in the second aspect.
  • In the embodiments of the present application, the first device obtains first information, where the first information includes first data; the first device uses the first model to process the first data to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • the embodiments of the present application can improve the reliability of AI-based wireless communication.
  • Figure 1 is a structural diagram of a network system applicable to the embodiment of the present application.
  • Figure 2 is a schematic structural diagram of a neuron
  • Figure 3 is the first flow chart of a model training processing method provided by an embodiment of the present application.
  • Figure 4 is the second flow chart of a model training processing method provided by an embodiment of the present application.
  • Figure 5 is the third flow chart of a model training processing method provided by an embodiment of the present application.
  • Figure 6 is the fourth flow chart of a model training processing method provided by an embodiment of the present application.
  • Figure 7 is the fifth flow chart of a model training processing method provided by an embodiment of the present application.
  • Figure 8 is the sixth flow chart of a model training processing method provided by an embodiment of the present application.
  • Figure 9 is the first structural diagram of a model training processing device provided by an embodiment of the present application.
  • Figure 10 is the second structural diagram of a model training processing device provided by an embodiment of the present application.
  • Figure 11 is the third structural diagram of a model training processing device provided by an embodiment of the present application.
  • Figure 12 is a structural diagram of a communication device provided by an embodiment of the present application.
  • Figure 13 is a structural diagram of a terminal provided by an embodiment of the present application.
  • Figure 14 is a structural diagram of another network-side device provided by an embodiment of the present application.
  • The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited; for example, the first object can be one or multiple.
  • "And/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.
  • LTE: Long Term Evolution
  • LTE-A: LTE-Advanced
  • CDMA: Code Division Multiple Access
  • TDMA: Time Division Multiple Access
  • FDMA: Frequency Division Multiple Access
  • OFDMA: Orthogonal Frequency Division Multiple Access
  • SC-FDMA: Single-Carrier Frequency Division Multiple Access
  • NR: New Radio
  • FIG. 1 shows a block diagram of a wireless communication system to which embodiments of the present application are applicable.
  • the wireless communication system includes a terminal 11 and a network side device 12.
  • The terminal 11 may be a mobile phone, a tablet personal computer, a laptop computer or notebook computer, a personal digital assistant (PDA), a palmtop computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile Internet device (MID), an augmented reality (AR) or virtual reality (VR) device, a robot, a wearable device, vehicle user equipment (VUE), pedestrian user equipment (PUE), a smart home device (home equipment with wireless communication functions, such as a refrigerator, TV, washing machine or furniture), a game console, a personal computer (PC), a teller machine, a self-service machine or other terminal-side device.
  • Wearable devices include: smart watches, smart bracelets, smart headphones, smart glasses, smart jewelry (smart bracelets, smart rings, smart necklaces, smart anklets, etc.), smart wristbands, smart clothing, etc.
  • the network side device 12 may include an access network device or a core network device, where the access network device may also be called a radio access network device, a radio access network (Radio Access Network, RAN), a radio access network function or a wireless access network unit.
  • Access network equipment may include a base station, a Wireless Local Area Network (WLAN) access point or a WiFi node, etc.
  • The base station may be called a Node B, an evolved Node B (eNB), an access point, a base transceiver station (BTS), a radio base station, a radio transceiver, a basic service set (BSS), an extended service set (ESS), a home Node B, a home evolved Node B, a transmitting and receiving point (TRP) or some other appropriate term in the field; as long as the same technical effect is achieved, the base station is not limited to specific technical terms. It should be noted that in the embodiments of this application, the base station in the NR system is only introduced as an example, and the specific type of base station is not limited.
  • The AI module may be implemented by a neural network, a decision tree, a support vector machine, a Bayesian classifier, etc. This application takes the neural network as an example for explanation, but does not limit the specific type of AI module.
  • Neural networks are composed of neurons; a schematic diagram of a neuron is shown in Figure 2, where a_i is the input, w_i is the weight (multiplicative coefficient), b is the bias (additive coefficient), and σ(·) is the activation function. Common activation functions include the Sigmoid function, the hyperbolic tangent function (tanh), and the rectified linear unit (ReLU), etc.
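The neuron of Figure 2 can be sketched in a few lines of Python; the activation functions shown and the example input values are illustrative, not taken from the application:

```python
import math

# A single neuron: output = activation(sum_i a_i * w_i + b)
def neuron(inputs, weights, bias, activation):
    z = sum(a * w for a, w in zip(inputs, weights)) + bias  # weighted sum plus bias
    return activation(z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

# z = 1.0*0.5 + 2.0*(-0.25) + 0.1 = 0.1, so ReLU passes 0.1 through
y = neuron([1.0, 2.0], [0.5, -0.25], 0.1, relu)
```

The same `neuron` function can be reused with `sigmoid` or `math.tanh` in place of `relu`.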
  • the parameters of the neural network are optimized through optimization algorithms.
  • An optimization algorithm is a type of algorithm that can help us minimize or maximize an objective function (sometimes also called a loss function).
  • The objective function is often a mathematical combination of model parameters and data. For example, given data X and its corresponding label Y, we build a neural network model f(·). With the model, we can obtain the predicted output f(x) for an input x, and the difference between the prediction and the label, (f(x) - Y), forms the loss function. Our purpose is to find appropriate w and b that minimize the value of this loss function; the smaller the loss value, the closer our model is to the real situation.
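The idea of choosing w and b to minimize the loss can be illustrated with a toy gradient-descent loop on a one-neuron linear model f(x) = w·x + b; the data, learning rate and iteration count below are illustrative assumptions, not values from the application:

```python
# Fit f(x) = w*x + b to points generated from the true relation y = 2x + 1
# by minimizing the mean squared-error loss mean((f(x) - Y)^2).
X = [0.0, 1.0, 2.0, 3.0]
Y = [1.0, 3.0, 5.0, 7.0]      # labels from y = 2x + 1
w, b, lr = 0.0, 0.0, 0.05     # start from zero, small learning rate

for _ in range(2000):
    # gradients of the loss with respect to w and b (chain rule)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(X, Y)) / len(X)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(X, Y)) / len(X)
    w -= lr * grad_w          # step both parameters downhill
    b -= lr * grad_b
```

After enough iterations, w and b approach the true values 2 and 1, and the loss approaches zero.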
  • The basic idea of the error back propagation (BP) algorithm is that the learning process consists of two processes: forward propagation of signals and back propagation of errors.
  • the input sample is passed in from the input layer, processed layer by layer by each hidden layer, and then transmitted to the output layer. If the actual output of the output layer does not match the expected output, it will enter the error backpropagation stage.
  • Error back propagation propagates the output error back toward the input layer, layer by layer through the hidden layers, and allocates the error to all units in each layer, thereby obtaining the error signal of each layer's units; this error signal is used as the basis for correcting the weights of each unit.
  • This process of adjusting the weights of each layer in forward signal propagation and error back propagation is carried out over and over again.
  • The process of continuous weight adjustment is the learning and training process of the network. This process continues until the error of the network output is reduced to an acceptable level, or until a preset number of learning iterations is reached.
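The forward-propagation / error-back-propagation cycle described above can be sketched for a tiny two-layer network; the network size, data, learning rate and iteration count are illustrative choices, not values from the application:

```python
import numpy as np

# Toy two-layer network trained by forward propagation + error back propagation.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                      # 32 samples, 4 features
Y = (X.sum(axis=1, keepdims=True) > 0) * 1.0      # toy labels

W1, b1 = rng.normal(scale=0.5, size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(300):
    # forward propagation: input layer -> hidden layer -> output layer
    H = np.tanh(X @ W1 + b1)
    P = sigmoid(H @ W2 + b2)
    losses.append(float(np.mean((P - Y) ** 2)))

    # error back propagation: the output error flows back layer by layer,
    # giving each layer's error signal and hence its weight correction
    dP = 2 * (P - Y) / len(X)            # dLoss/dP
    dZ2 = dP * P * (1 - P)               # through the sigmoid
    dW2, db2 = H.T @ dZ2, dZ2.sum(axis=0)
    dZ1 = (dZ2 @ W2.T) * (1 - H ** 2)    # through the tanh
    dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0)

    W2 -= lr * dW2; b2 -= lr * db2       # adjust the weights, over and over
    W1 -= lr * dW1; b1 -= lr * db1
```

The recorded `losses` shrink over the iterations, mirroring the "continue until the output error is acceptable" stopping rule.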
  • For different scenarios, the AI algorithm selected and the model used also differ.
  • the main way to use AI to improve 5G network performance is to enhance or replace existing algorithms or processing modules with neural network-based algorithms and models.
  • neural network-based algorithms and models can achieve better performance than deterministic-based algorithms.
  • the more commonly used neural networks include deep neural networks, convolutional neural networks, and recurrent neural networks. With the help of existing AI tools, the construction, training and verification of neural networks can be achieved.
  • the training of AI models requires the support of a large amount of data. If the amount of data is insufficient, the model training process may not converge, or the trained model may be overfitted. However, in many scenarios in wireless communication, tagged data cannot be obtained, or the amount of tagged data is small (due to collection overhead, transmission overhead, etc.). Therefore, it is necessary to solve the problem of model training when there is insufficient labeled data in wireless communications. To this end, the model training processing method of this application is proposed.
  • this embodiment of the present application provides a model training processing method, including:
  • Step 301 The first device obtains first information, where the first information includes first data;
  • Step 302 The first device uses the first model to process the first data to obtain the second data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
  • the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • the first device may be a network-side device or a terminal, and the first data may be at least part of the data used to train the second model.
  • the first data may be tagged data or untagged data.
  • The above-mentioned first model can be understood as a model that enhances the training data of the second model. For example, when the first data is tagged data, the first model is used to expand the first data, thereby obtaining a larger amount of second data; when the first data is untagged data, the first model is used to tag the first data, so that more tagged training data can be obtained.
  • For example, if the first data is N pieces of tagged data, after processing by the first model, M pieces of tagged data can be output (that is, the second data is M pieces of tagged data), where M is greater than N, and optionally M is much greater than N.
  • As another example, if the first data is M pieces of untagged data, after processing by the first model, M pieces of tagged data can be output (that is, the second data is M pieces of tagged data).
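The two cases above — tagging untagged data, and expanding N tagged samples into M > N samples — can be sketched as follows; `toy_first_model` and the jitter-based expansion are hypothetical stand-ins invented for illustration, since the application does not specify how the first model works internally:

```python
import random

random.seed(0)

def label_data(first_model, untagged):
    """Untagged case: M samples in -> M (sample, tag) pairs out."""
    return [(x, first_model(x)) for x in untagged]

def expand_data(tagged, copies=3, noise=0.01):
    """Tagged case: N pairs in -> M = N * copies pairs out, with M > N."""
    out = []
    for x, y in tagged:
        for _ in range(copies):
            out.append((x + random.uniform(-noise, noise), y))  # jittered copy
    return out

toy_first_model = lambda x: 1 if x > 0 else 0   # assumed stand-in labeler
tagged_out = label_data(toy_first_model, [0.3, -1.2, 0.7])  # M=3 in, M=3 tagged out
expanded = expand_data([(0.3, 1), (-1.2, 0)])               # N=2 in, M=6 out
```

Either output (or both) would then serve as training data for the second model.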
  • the above-mentioned first information may be stored in the first device or in the second device.
  • The above-mentioned first model can be stored in the first device or in the second device; when it is stored in the second device, the second device can send it to the first device.
  • For example, when the first device is a core network device, the second device can be a base station; when the first device is a base station (such as base station A), the second device can be another base station (such as base station B) or a terminal; when the first device is a terminal (such as terminal A), the second device can be a base station or another terminal (such as terminal B).
  • the above-mentioned first information and first model are stored in different devices.
  • In the embodiments of the present application, the first device obtains first information, where the first information includes first data; the first device uses the first model to process the first data to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • the embodiments of the present application can improve the reliability of AI-based wireless communication.
  • the first model is stored in the second device, and before the first device uses the first model to process the first data, the method further includes:
  • the first device receives second information from a second device, the second information including the first model.
  • the above-mentioned second information including the first model can be understood as: the second information includes parameters of the first model or includes address information of the first model, so that the first device can obtain the first model.
  • Optionally, the second information also includes at least one of configuration information and first auxiliary information, wherein the configuration information is used to indicate how the first model is used, the first auxiliary information includes statistical information and environmental information required for the operation of the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
  • the above configuration information is used to indicate the usage method of the first model.
  • it may include the data dimensions or input data format, output dimensions or output data format, input data volume, output data volume, etc. of the first model.
  • the above-mentioned environmental information can be understood as environmental information related to the data enhancement algorithm of the first model.
  • the environmental information can include the software environment and hardware environment required for model operation.
• it can include the software architecture, hardware architecture, power requirements, storage requirements, computing power requirements, etc. that need to be used.
  • the above statistical information may include distribution characteristic information such as mean and variance of the model input.
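As one hedged example of how such statistical information might be applied, the mean and variance could be used to standardize inputs so that they match the distribution the first model expects. The function below is an illustrative assumption, not a procedure defined in this application:

```python
import math

# Hypothetical use of the statistical information (mean, variance) from the
# first auxiliary information: standardize raw inputs before feeding them
# to the first model.

def standardize(inputs, mean, variance, eps=1e-8):
    """Scale raw inputs toward zero mean and unit variance."""
    std = math.sqrt(variance + eps)  # eps guards against zero variance
    return [(x - mean) / std for x in inputs]
```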
• before the first device receives the second information from the second device, the method further includes:
  • the first device sends a first request message to the second device, where the first request message is used to request acquisition of the second information.
• when the first device needs to expand the first data, it can obtain the above-mentioned second information through a request, thereby improving the pertinence of obtaining the second information.
  • the second device can also actively send the second information to the first device. For example, when the second device establishes a connection with the first device, the second device sends the second information to the first device.
  • the second information can also be broadcast by the second device, and when the first device needs the second information, it can be directly obtained from the broadcast information.
• after the first device uses the first model to process the first data and obtains the second data, the method further includes:
  • the first device trains the second model based on the second data to obtain a third model.
  • the first device can train the second model to obtain the third model.
  • the third model can be used on the first device or on the second device.
  • the above-mentioned second model may be sent by the second device to the first device, or may be a protocol pre-configured on the first device, which is not further limited here.
  • the first device trains the second model based on the second data.
• the method further includes:
  • the first device sends the third model to the second device.
• sending the third model can be understood as sending the parameters of the third model or sending the address information of the third model; this is not further limited here. In this way, when the trained third model is used to perform inference for the corresponding business, the accuracy of the inference can be improved, thereby ensuring the reliability of communication.
  • the initial training data of the above-mentioned second model may include the above-mentioned first data, and may also include labeled third data.
• if the first data is tagged data, the training data may include the first data, the second data and the third data; if the first data is untagged data, when training the second model, the training data may include the second data and the third data.
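The selection rule above can be sketched as follows; the function name and the flat-list representation of data sets are illustrative assumptions:

```python
# Sketch of assembling training data for the second model, following the
# rule above: tagged first data may be used directly, untagged may not.

def build_training_set(first_data, second_data, third_data, first_is_tagged):
    if first_is_tagged:
        # Tagged first data can itself serve as training data.
        return first_data + second_data + third_data
    # Untagged first data is excluded; train on the tagged second and
    # third data only.
    return second_data + third_data
```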
  • the first device obtaining the first information includes any of the following:
  • the first device receives first information from the second device
  • the first device obtains first information locally.
• when the first information is stored in the second device, the first device can receive the first information from the second device; when the first information is stored in the first device, the first device can obtain the first information locally.
• before the first device receives the first information from the second device, the method further includes:
  • the first device sends instruction information to the second device, where the instruction information is used to instruct the second device to send the first information.
  • the first information is stored in the second device, and the first device needs to instruct the second device to send the first information.
  • the first device can schedule the second device to send the first information.
• before the first device receives the first information from the second device, the method further includes:
  • the first device receives a second request message from the second device, and the second request message is used for the second device to request to send the first information.
• the second device may request the first device to send the first information. After that, the second device may send the first information on preconfigured resources, or the first device may dynamically schedule the second device to send the first information, for example, instruct the second device to send the first information through the above instruction information.
• when the first device receives the first information from the second device, after the first device uses the first model to process the first data and obtains the second data, the method further includes:
  • the first device sends third information to the second device, where the third information includes the second data.
• if the training process of the second model is performed by the second device, the first device needs to send the second data to the second device for the second device to train the second model to obtain the third model.
  • the training of the second model by the second device is similar to the training of the second model by the first device.
  • the definition of the training data can refer to the above example, which will not be described again here.
  • the third information also includes identification information, and the identification information is used to indicate that the second data is obtained based on the first model.
  • the first information also includes second auxiliary information, and the second auxiliary information is used to represent the distribution characteristics of the first data.
  • the second auxiliary information may include information representing distribution characteristics such as the mean and variance of the first data.
  • device A sends the first model to device B.
  • Device B uses the received first model and its own first data to perform data enhancement to obtain second data.
  • Device B then trains the second model based on the second data.
• As shown in Figure 4, this process specifically includes the following:
  • Step 401 Device B sends a first message to device A.
  • the first message is used to request the first model, configuration parameters and first auxiliary information.
  • Step 402 Device A sends the first model, configuration parameters and first auxiliary information to device B.
  • Step 403 Device B enhances the first data based on the first model, configuration parameters and first auxiliary information to obtain second data.
  • Step 404 Device B trains the second model based on the second data to obtain a third model.
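Steps 401 to 404 can be summarized in a brief sketch. `DeviceA`, `DeviceB`, and the callable model are hypothetical stand-ins; the actual signaling, model format, and training procedure are not specified here:

```python
# Minimal sketch of the Figure 4 exchange: device B requests the first
# model from device A, enhances its own first data, then trains the
# second model (the training step itself is elided).

class DeviceA:
    def __init__(self, first_model, config, first_aux):
        self.first_model = first_model
        self.config = config
        self.first_aux = first_aux

    def on_first_message(self):
        # Step 402: return the first model, configuration parameters and
        # first auxiliary information in response to the first message.
        return self.first_model, self.config, self.first_aux


class DeviceB:
    def __init__(self, first_data):
        self.first_data = first_data

    def run(self, device_a):
        # Step 401: send the first message to request the first model.
        model, config, aux = device_a.on_first_message()
        # Step 403: enhance the first data to obtain the second data.
        second_data = [model(x) for x in self.first_data]
        # Step 404: train the second model on the second data to obtain
        # the third model (represented here by a placeholder tuple).
        return ("third_model", len(second_data))
```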
  • device B sends the first data to device A.
• Device A uses the received first data and its own trained first model to perform data enhancement to obtain the second data.
• Device A then trains the second model based on the second data to obtain the third model.
  • the third model is sent to device B.
  • the specific process includes the following:
  • Step 501 Device B sends a second message to device A, where the second message is used to request the sending of the first data;
  • Step 502 Device A sends a third message to device B.
  • the third message is used to instruct to send the first data.
  • Step 503 Device B sends the first data to device A.
  • Step 504 Device A enhances the first data based on its first model, configuration parameters and first auxiliary information to obtain second data.
  • Step 505 Device A trains the second model based on the second data to obtain a third model.
  • Step 506 Device A sends the third model to device B.
  • device B sends the first data to device A.
• Device A uses the received first data and its own trained first model to perform data enhancement to obtain the second data.
• Device A then sends the second data to device B.
• Device B uses the received second data to train the second model. As shown in Figure 6, this process specifically includes the following:
  • Step 601 Device B sends a second message to device A.
  • the second message is used to request to send the first data
  • Step 602 Device A sends a third message to device B.
  • the third message is used to instruct to send the first data.
  • Step 603 Device B sends the first data to device A.
  • Step 604 Device A enhances the first data based on its first model, configuration parameters and first auxiliary information to obtain second data.
  • Step 605 Device A sends second data to device B.
  • Step 606 Device B trains the second model based on the second data to obtain a third model.
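Steps 601 to 606 differ from the Figure 4 flow in that the enhancement runs on device A while the training runs on device B. A compact sketch under the same illustrative assumptions (function and parameter names are hypothetical):

```python
# Hedged sketch of the Figure 6 flow: device B sends its first data to
# device A, device A enhances it into second data and returns it, and
# device B trains the second model (training elided).

def figure6_flow(device_a_model, first_data):
    # Steps 601-603: device B requests to send, device A instructs the
    # sending, and device B transfers the first data to device A.
    received = list(first_data)
    # Step 604: device A enhances the first data into second data.
    second_data = [device_a_model(x) for x in received]
    # Step 605: device A sends the second data back to device B.
    # Step 606: device B trains the second model on it (not shown).
    return second_data
```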
  • this embodiment of the present application also provides a model training processing method, including:
  • Step 701 The second device sends second information to the first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on the first data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
• the second data satisfies at least one of the following: in the case that the first data is untagged data, the second data is tagged data; in the case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
• the second information also includes at least one of configuration information and first auxiliary information, wherein the configuration information is used to indicate how the first model is used, the first auxiliary information includes statistical information and environmental information required for the operation of the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
• before the first device receives the second information from the second device, the method further includes:
  • the second device receives a first request message from the first device, where the first request message is used to request acquisition of the second information.
  • this embodiment of the present application also provides a model training processing method, including:
  • Step 801 The second device sends first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on the first model;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
• the second data satisfies at least one of the following: in the case that the first data is untagged data, the second data is tagged data; in the case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • the method further includes:
• the second device receives a third model from the first device, where the third model is obtained by the first device training the second model based on the second data.
  • the method further includes:
  • the second device receives third information from the first device, the third information including the second data.
  • the third information also includes identification information, and the identification information is used to indicate that the second data is obtained based on the first model.
  • the method further includes:
  • the second device trains the second model based on the second data to obtain a third model.
• before the second device sends the first information to the first device, the method further includes:
  • the second device receives indication information from the first device, and the indication information is used to instruct the second device to send the first information.
• before the second device sends the first information to the first device, the method further includes:
  • the second device sends a second request message to the first device, where the second request message is used by the second device to request to send the first information.
  • the first information also includes second auxiliary information, and the second auxiliary information is used to represent the distribution characteristics of the first data.
  • the execution subject may be a model training processing device.
  • the model training processing device executing the model training processing method is used as an example to illustrate the model training processing device provided by the embodiment of the present application.
  • an embodiment of the present application also provides a model training processing device.
  • the model training processing device 900 includes:
  • Obtaining module 901 is used to obtain first information, where the first information includes first data;
  • the processing module 902 is used to process the first data using the first model to obtain the second data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
• the second data satisfies at least one of the following: in the case that the first data is untagged data, the second data is tagged data; in the case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • model training processing device 900 also includes:
  • a first receiving module configured to receive second information from a second device, where the second information includes the first model.
• the second information also includes at least one of configuration information and first auxiliary information, wherein the configuration information is used to indicate how the first model is used, the first auxiliary information includes statistical information and environmental information required for the operation of the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
  • model training processing device 900 also includes:
  • the first sending module is configured to send a first request message to the second device, where the first request message is used to request acquisition of the second information.
  • model training processing device 900 also includes:
  • the first training module is used to train the second model based on the second data to obtain a third model.
  • model training processing device 900 also includes:
  • a first sending module configured to send the third model to the second device.
  • the acquisition module 901 includes any of the following:
  • a receiving unit configured to receive the first information from the second device
  • the acquisition unit is used to acquire the first information locally.
  • model training processing device 900 also includes:
• the first sending module is configured to send instruction information to the second device, where the instruction information is used to instruct the second device to send the first information.
  • the receiving unit is further configured to: receive a second request message from the second device, where the second request message is used for the second device to request to send the first information.
  • model training processing device 900 also includes:
  • a first sending module configured to send third information to the second device, where the third information includes the second data.
  • the third information also includes identification information, and the identification information is used to indicate that the second data is obtained based on the first model.
  • the first information also includes second auxiliary information, and the second auxiliary information is used to represent the distribution characteristics of the first data.
  • model training and processing device 1000 includes:
  • the second sending module 1001 is used to send second information to the first device, where the second information includes the first model,
  • the first model is used by the first device to obtain second data based on the first data;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
• the second data satisfies at least one of the following: in the case that the first data is untagged data, the second data is tagged data; in the case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
• the second information also includes at least one of configuration information and first auxiliary information, wherein the configuration information is used to indicate how the first model is used, the first auxiliary information includes statistical information and environmental information required for the operation of the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
  • model training processing device 1000 also includes:
  • the second receiving module is configured to receive a first request message from the first device, where the first request message is used to request acquisition of the second information.
  • model training and processing device 1100 includes:
  • the third sending module 1101 is configured to send first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on the first model;
  • both the first data and the second data can be used to train a second model
  • the second model is a business model
• the second data satisfies at least one of the following: in the case that the first data is untagged data, the second data is tagged data; in the case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • model training processing device 1100 also includes:
• a third receiving module is configured to receive a third model from the first device, where the third model is obtained by the first device training the second model based on the second data.
  • model training processing device 1100 also includes:
  • a third receiving module is configured to receive third information from the first device, where the third information includes the second data.
  • the third information also includes identification information, and the identification information is used to indicate that the second data is obtained based on the first model.
  • model training processing device 1100 also includes:
  • a second training module is used to train the second model based on the second data to obtain a third model.
  • model training processing device 1100 also includes:
  • the third receiving module is configured to receive indication information from the first device, where the indication information is used to instruct the second device to send the first information.
  • the third sending module 1101 is also configured to send a second request message to the first device, where the second request message is used for the second device to request to send the first information.
  • the first information also includes second auxiliary information, and the second auxiliary information is used to represent the distribution characteristics of the first data.
  • the model training processing device in the embodiment of the present application may be an electronic device, such as an electronic device with an operating system, or may be a component in the electronic device, such as an integrated circuit or chip.
  • the electronic device may be a terminal or other devices other than the terminal.
  • terminals may include but are not limited to the types of terminals 11 listed above, and other devices may be servers, network attached storage (Network Attached Storage, NAS), etc., which are not specifically limited in the embodiment of this application.
  • the model training processing device provided by the embodiment of the present application can implement each process implemented by the method embodiments of Figures 3 to 8, and achieve the same technical effect. To avoid duplication, the details will not be described here.
  • this embodiment of the present application also provides a communication device 1200, which includes a processor 1201 and a memory 1202.
• the memory 1202 stores programs or instructions that can be run on the processor 1201; when the programs or instructions are executed by the processor 1201, each step of the above model training processing method embodiments is implemented, and the same technical effect can be achieved. To avoid repetition, details are not described here.
  • An embodiment of the present application also provides a terminal, including a processor and a communication interface.
• the processor is used to obtain first information, where the first information includes first data, and to use the first model to process the first data to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data;
• the communication interface is used to send second information to the first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on the first data;
• wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
• the communication interface is used to send first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on the first model;
• wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • FIG. 13 is a schematic diagram of the hardware structure of a terminal that implements an embodiment of the present application.
• the terminal 1300 includes but is not limited to at least some of the following components: a radio frequency unit 1301, a network module 1302, an audio output unit 1303, an input unit 1304, a sensor 1305, a display unit 1306, a user input unit 1307, an interface unit 1308, a memory 1309, a processor 1310, etc.
  • the terminal 1300 may also include a power supply (such as a battery) that supplies power to various components.
• the power supply may be logically connected to the processor 1310 through a power management system, thereby implementing functions such as managing charging, discharging, and power consumption through the power management system.
  • the terminal structure shown in FIG. 13 does not constitute a limitation on the terminal.
  • the terminal may include more or fewer components than shown in the figure, or some components may be combined or arranged differently, which will not be described again here.
  • the input unit 1304 may include a graphics processing unit (Graphics Processing Unit, GPU) 13041 and a microphone 13042.
• the graphics processor 13041 processes image data of still pictures or videos obtained by an image capture device (such as a camera) in video capture mode or image capture mode.
  • the display unit 1306 may include a display panel 13061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
• the user input unit 1307 includes a touch panel 13071 and at least one of other input devices 13072. The touch panel 13071 is also called a touch screen.
  • the touch panel 13071 may include two parts: a touch detection device and a touch controller.
  • Other input devices 13072 may include but are not limited to physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be described again here.
• after receiving downlink data from the network side device, the radio frequency unit 1301 can transmit it to the processor 1310 for processing; in addition, the radio frequency unit 1301 can send uplink data to the network side device.
  • the radio frequency unit 1301 includes, but is not limited to, an antenna, amplifier, transceiver, coupler, low noise amplifier, duplexer, etc.
  • Memory 1309 may be used to store software programs or instructions as well as various data.
• the memory 1309 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system, and applications or instructions required for at least one function (such as a sound playback function, an image playback function, etc.).
  • memory 1309 may include volatile memory or nonvolatile memory, or memory 1309 may include both volatile and nonvolatile memory.
• the non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory.
• the volatile memory can be random access memory (Random Access Memory, RAM), static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synch link DRAM, SLDRAM) or direct memory bus random access memory (Direct Rambus RAM, DRRAM).
  • the processor 1310 may include one or more processing units; optionally, the processor 1310 integrates an application processor and a modem processor, where the application processor mainly handles operations related to the operating system, user interface, application programs, etc., Modem processors mainly process wireless communication signals, such as baseband processors. It can be understood that the above modem processor may not be integrated into the processor 1310.
• when the terminal is a first device, the processor 1310 is used to obtain first information, where the first information includes first data, and to use the first model to process the first data to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data;
• the radio frequency unit 1301 is configured to send second information to the first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on the first data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
• the radio frequency unit 1301 is configured to send first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on the first model; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • An embodiment of the present application also provides a network side device, including a processor and a communication interface.
• the processor is used to obtain first information, where the first information includes first data, and to use the first model to process the first data to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data;
• the communication interface is used to send second information to the first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on the first data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
• the communication interface is used to send first information to the first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on the first model; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: when the first data is untagged data, the second data is tagged data; when the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
  • This network-side device embodiment corresponds to the above-mentioned network-side device method embodiment.
  • Each implementation process and implementation manner of the above-mentioned method embodiment can be applied to this network-side device embodiment, and can achieve the same technical effect.
  • the embodiment of the present application also provides a network side device.
  • the network side device 1400 includes: an antenna 1401, a radio frequency device 1402, a baseband device 1403, a processor 1404 and a memory 1405.
  • Antenna 1401 is connected to radio frequency device 1402.
• the radio frequency device 1402 receives information through the antenna 1401 and sends the received information to the baseband device 1403 for processing.
  • the baseband device 1403 processes the information to be sent and sends it to the radio frequency device 1402.
  • the radio frequency device 1402 processes the received information and then sends it out through the antenna 1401.
  • the method performed by the network side device in the above embodiment can be implemented in the baseband device 1403, which includes a baseband processor.
  • the baseband device 1403 may include, for example, at least one baseband board provided with multiple chips; as shown in FIG. 14, one of the chips is, for example, a baseband processor, which is connected to the memory 1405 through a bus interface to call the program in the memory 1405 and perform the network device operations shown in the above method embodiments.
  • the network side device may also include a network interface 1406, which is, for example, a common public radio interface (CPRI).
  • the network side device 1400 in this embodiment of the present invention also includes: instructions or programs stored in the memory 1405 and executable on the processor 1404.
  • the processor 1404 calls the instructions or programs in the memory 1405 to execute the methods executed by the modules shown in Figures 9 to 11 and achieves the same technical effect; to avoid repetition, details are not repeated here.
  • Embodiments of the present application also provide a readable storage medium.
  • Programs or instructions are stored on the readable storage medium.
  • when the program or instructions are executed by a processor, each process of the above model training processing method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • the processor is the processor in the terminal described in the above embodiment.
  • the readable storage medium includes computer-readable storage media, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
  • An embodiment of the present application further provides a chip.
  • the chip includes a processor and a communication interface.
  • the communication interface is coupled to the processor.
  • the processor is used to run programs or instructions to implement each process of the above model training processing method embodiment and can achieve the same technical effect; to avoid repetition, details are not repeated here.
  • the chips mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip, etc.
  • Embodiments of the present application further provide a computer program/program product.
  • the computer program/program product is stored in a storage medium.
  • the computer program/program product is executed by at least one processor to implement each process of the above model training processing method embodiment and can achieve the same technical effect; to avoid repetition, details are not repeated here.
  • the methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform; of course, they can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or the part contributing to the existing technology, can be embodied in the form of a computer software product.
  • the computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions to cause a terminal (which can be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the various embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Neurology (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

This application discloses a model training processing method and apparatus, a terminal, and a network-side device, belonging to the field of communication technology. The model training processing method of the embodiments of this application includes: a first device obtains first information, where the first information includes first data; the first device processes the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.

Description

Model training processing method and apparatus, terminal, and network-side device
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 202210489247.9, filed in China on May 6, 2022, the entire contents of which are incorporated herein by reference.
Technical field
This application belongs to the field of communication technology, and specifically relates to a model training processing method and apparatus, a terminal, and a network-side device.
Background
With the development of communication technology, communication scenarios based on artificial intelligence (AI) have been introduced into wireless communication. At present, in many AI-based wireless communication scenarios, it is difficult to obtain a large amount of tagged data. Without a large amount of tagged data, a suitable model cannot be trained through supervised learning, which results in low communication reliability. Therefore, in the related art, there is a problem that the reliability of AI-based wireless communication is low.
Summary
Embodiments of this application provide a model training processing method and apparatus, a terminal, and a network-side device, which improve the reliability of AI-based wireless communication.
According to a first aspect, a model training processing method is provided, including:
a first device obtains first information, where the first information includes first data;
the first device processes the first data using a first model to obtain second data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
According to a second aspect, a model training processing method is provided, including:
a second device sends second information to a first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on first data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
According to a third aspect, a model training processing apparatus is provided, including:
an obtaining module, configured to obtain first information, where the first information includes first data;
a processing module, configured to process the first data using a first model to obtain second data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
According to a fourth aspect, a model training processing apparatus is provided, including:
a second sending module, configured to send second information to a first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on first data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
According to a fifth aspect, a model training processing apparatus is provided, including:
a second sending module, configured to send second information to a first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on first data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
According to a sixth aspect, a model training processing apparatus is provided, including:
a third sending module, configured to send first information to a first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on a first model;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
According to a seventh aspect, a terminal is provided. The terminal includes a processor and a memory, where the memory stores a program or instructions executable on the processor, and when the program or instructions are executed by the processor, the steps of the method according to the first aspect are implemented.
According to an eighth aspect, a terminal is provided, including a processor and a communication interface, where:
when the terminal is the first device, the processor is configured to obtain first information, where the first information includes first data, and to process the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data;
or, when the terminal is the second device, the communication interface is configured to send second information to the first device, where the second information includes the first model, and the first model is used by the first device to obtain the second data based on the first data; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
Or, when the terminal is the second device, the communication interface is configured to send first information to the first device, where the first information includes the first data, and the first data is used by the first device to obtain the second data based on the first model; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
According to a ninth aspect, a network-side device is provided. The network-side device includes a processor and a memory, where the memory stores a program or instructions executable on the processor, and when the program or instructions are executed by the processor, the steps of the method according to the second aspect are implemented.
According to a tenth aspect, a network-side device is provided, including a processor and a communication interface, where:
when the network-side device is the first device, the processor is configured to obtain first information, where the first information includes first data, and to process the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data;
or, when the network-side device is the second device, the communication interface is configured to send second information to the first device, where the second information includes the first model, and the first model is used by the first device to obtain the second data based on the first data; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
Or, when the network-side device is the second device, the communication interface is configured to send first information to the first device, where the first information includes the first data, and the first data is used by the first device to obtain the second data based on the first model; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
According to an eleventh aspect, a readable storage medium is provided, where a program or instructions are stored on the readable storage medium, and when the program or instructions are executed by a processor, the steps of the method according to the first aspect or the steps of the method according to the second aspect are implemented.
According to a twelfth aspect, a chip is provided. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the steps of the method according to the first aspect or the steps of the method according to the second aspect.
According to a thirteenth aspect, a computer program/program product is provided. The computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement the steps of the method according to the first aspect or the steps of the method according to the second aspect.
In the embodiments of this application, a first device obtains first information, where the first information includes first data; the first device processes the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data. In this way, since more tagged training data can be obtained by using the first model, the second model can converge effectively during the training process and the performance of the second model is improved. Therefore, the embodiments of this application can improve the reliability of AI-based wireless communication.
Brief description of the drawings
Figure 1 is a structural diagram of a network system to which embodiments of this application are applicable;
Figure 2 is a schematic structural diagram of a neuron;
Figure 3 is a first flowchart of a model training processing method provided by an embodiment of this application;
Figure 4 is a second flowchart of a model training processing method provided by an embodiment of this application;
Figure 5 is a third flowchart of a model training processing method provided by an embodiment of this application;
Figure 6 is a fourth flowchart of a model training processing method provided by an embodiment of this application;
Figure 7 is a fifth flowchart of a model training processing method provided by an embodiment of this application;
Figure 8 is a sixth flowchart of a model training processing method provided by an embodiment of this application;
Figure 9 is a first structural diagram of a model training processing apparatus provided by an embodiment of this application;
Figure 10 is a second structural diagram of a model training processing apparatus provided by an embodiment of this application;
Figure 11 is a third structural diagram of a model training processing apparatus provided by an embodiment of this application;
Figure 12 is a structural diagram of a communication device provided by an embodiment of this application;
Figure 13 is a structural diagram of a terminal provided by an embodiment of this application;
Figure 14 is a structural diagram of another network-side device provided by an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application will be clearly described below with reference to the accompanying drawings in the embodiments of this application. Obviously, the described embodiments are only some of the embodiments of this application, rather than all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application fall within the protection scope of this application.
The terms "first", "second", and the like in the specification and claims of this application are used to distinguish similar objects, rather than to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of this application can be implemented in an order other than those illustrated or described herein; the objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited, for example, there may be one or more first objects. In addition, "and/or" in the specification and claims indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
It is worth pointing out that the technologies described in the embodiments of this application are not limited to Long Term Evolution (LTE)/LTE-Advanced (LTE-A) systems, and can also be used in other wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single-carrier Frequency Division Multiple Access (SC-FDMA), and other systems. The terms "system" and "network" in the embodiments of this application are often used interchangeably, and the described technologies can be used both for the systems and radio technologies mentioned above and for other systems and radio technologies. The following description describes a New Radio (NR) system for example purposes, and NR terminology is used in most of the following description, but these technologies can also be applied to applications other than NR system applications, such as 6th Generation (6G) communication systems.
Figure 1 shows a block diagram of a wireless communication system to which embodiments of this application are applicable. The wireless communication system includes a terminal 11 and a network-side device 12. The terminal 11 may be a terminal-side device such as a mobile phone, a tablet personal computer, a laptop computer (also known as a notebook computer), a personal digital assistant (PDA), a palmtop computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, vehicle user equipment (VUE), pedestrian user equipment (PUE), a smart home device (a home device with a wireless communication function, such as a refrigerator, a television, a washing machine, or furniture), a game console, a personal computer (PC), a teller machine, or a self-service machine; the wearable device includes: a smart watch, a smart bracelet, smart headphones, smart glasses, smart jewelry (a smart bangle, a smart wristband, a smart ring, a smart necklace, a smart anklet, a smart ankle chain, etc.), a smart wristband, smart clothing, and the like. It should be noted that the specific type of the terminal 11 is not limited in the embodiments of this application. The network-side device 12 may include an access network device or a core network device, where the access network device may also be called a radio access network device, a radio access network (RAN), a radio access network function, or a radio access network unit. The access network device may include a base station, a wireless local area network (WLAN) access point, a WiFi node, or the like; the base station may be called a Node B, an evolved Node B (eNB), an access point, a base transceiver station (BTS), a radio base station, a radio transceiver, a basic service set (BSS), an extended service set (ESS), a home Node B, a home evolved Node B, a transmitting receiving point (TRP), or some other suitable term in the art. As long as the same technical effect is achieved, the base station is not limited to specific technical vocabulary. It should be noted that in the embodiments of this application, only the base station in the NR system is used as an example for introduction, and the specific type of the base station is not limited.
For ease of understanding, some content involved in the embodiments of this application is described below:
1. Artificial intelligence.
Artificial intelligence is currently widely used in various fields. There are many implementations of AI modules, such as neural networks, decision trees, support vector machines, and Bayesian classifiers. This application takes a neural network as an example for description, but does not limit the specific type of the AI module.
A neural network is composed of neurons; a schematic diagram of a neuron is shown in Figure 2. Here, ai is the input, w is the weight (multiplicative coefficient), b is the bias (additive coefficient), and σ(.) is the activation function. Common activation functions include the sigmoid function, the hyperbolic tangent function (tanh), and the rectified linear unit (ReLU).
The parameters of a neural network are optimized through an optimization algorithm. An optimization algorithm is a class of algorithms that can help us minimize or maximize an objective function (sometimes called a loss function). The objective function is often a mathematical combination of model parameters and data. For example, given data X and its corresponding label Y, we construct a neural network model f(.). With the model, the predicted output f(x) can be obtained from the input x, and the gap between the predicted value and the true value, (f(x)-Y), can be computed; this is the loss function. Our goal is to find suitable W and b that minimize the value of the above loss function; the smaller the loss value, the closer the model is to the real situation.
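The neuron and loss computation described above can be sketched in a few lines of Python (a minimal illustration of the σ(Σ wi·ai + b) structure and a squared-error loss; the weights and inputs below are arbitrary example values, not taken from this application):

```python
import math

def sigmoid(x):
    # Sigmoid activation: sigma(x) = 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))

def neuron(a, w, b):
    # Weighted sum of inputs a with weights w plus bias b, passed through the activation
    z = sum(ai * wi for ai, wi in zip(a, w)) + b
    return sigmoid(z)

def squared_error(pred, label):
    # Gap between the predicted value f(x) and the true value Y
    return (pred - label) ** 2

out = neuron([1.0, 2.0], [0.5, -0.25], 0.1)  # sigma(0.5*1 - 0.25*2 + 0.1) = sigma(0.1)
loss = squared_error(out, 1.0)
```

Minimizing this loss over many (x, Y) pairs is what the optimization algorithm does.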
At present, common optimization algorithms are basically based on the error back propagation (BP) algorithm. The basic idea of the BP algorithm is that the learning process consists of two processes: forward propagation of the signal and backward propagation of the error. During forward propagation, the input sample is fed in from the input layer, processed layer by layer through the hidden layers, and passed to the output layer. If the actual output of the output layer does not match the expected output, the process turns to the backward propagation stage of the error. Error backpropagation passes the output error back layer by layer through the hidden layers to the input layer in some form, and apportions the error to all units of each layer, thereby obtaining the error signal of each layer's units; this error signal serves as the basis for correcting the weight of each unit. This process of adjusting the weights of each layer through forward signal propagation and backward error propagation is carried out over and over again. The process of continuously adjusting the weights is the learning and training process of the network. This process continues until the error of the network output is reduced to an acceptable level, or until a preset number of learning iterations is reached.
Depending on the type of problem to be solved, the selected AI algorithm and the adopted model also differ. At present, the main method of improving 5G network performance with the help of AI is to enhance or replace existing algorithms or processing modules with algorithms and models based on neural networks. In specific scenarios, algorithms and models based on neural networks can achieve better performance than deterministic algorithms. Commonly used neural networks include deep neural networks, convolutional neural networks, and recurrent neural networks. With the help of existing AI tools, the construction, training, and verification of neural networks can be realized.
It should be understood that the training of an AI model requires the support of a large amount of data. If the amount of data is insufficient, the training process of the model may not converge, or the trained model will overfit. However, in many scenarios in wireless communication, tagged data cannot be obtained, or the amount of tagged data is small (due to collection overhead, transmission overhead, etc.). Therefore, in wireless communication, the problem of model training when tagged data is insufficient needs to be solved. To this end, the model training processing method of this application is proposed.
The model training processing method provided by the embodiments of this application is described in detail below through some embodiments and application scenarios with reference to the accompanying drawings.
As shown in Figure 3, an embodiment of this application provides a model training processing method, including:
Step 301: a first device obtains first information, where the first information includes first data;
Step 302: the first device processes the first data using a first model to obtain second data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
In the embodiments of this application, the above first device may be a network-side device or a terminal, and the above first data may be at least part of the data used to train the second model. The first data may be tagged data or untagged data. The above first model can be understood as a model that augments the training data of the second model. For example, when the first data is tagged data, the first model is used to expand the first data, thereby obtaining second data with a larger data amount; when the first data is untagged data, the first model is used to tag the first data, so that more tagged training data can be obtained. In this way, after the first data is processed by the first model, more tagged training data can be obtained, which ensures that there is enough tagged training data to train the second model, so that the second model can converge effectively during the training process and the performance of the second model is improved.
For example, in some embodiments, the first data is N tagged data items; after the N tagged data items are input into the first model, M tagged data items can be output (that is, the second data is M tagged data items), where M is greater than N; usually, M is much greater than N.
For another example, in some embodiments, the first data is M untagged data items; after the M untagged data items are input into the first model, M tagged data items can be output (that is, the second data is M tagged data items).
Optionally, the above first information may be stored in the first device or in a second device. Meanwhile, the above first model may be stored in the first device, or stored in the second device and sent by the second device to the first device. When the first device is a core network device, the second device may be a base station; when the first device is a base station (e.g., base station A), the second device may be a base station (e.g., base station B) or a terminal; when the first device is a terminal (e.g., terminal A), the second device may be a base station or a terminal (e.g., terminal B). It should be understood that, in the embodiments of this application, the above first information and the first model are stored in different devices.
In the embodiments of this application, a first device obtains first information, where the first information includes first data; the first device processes the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data. In this way, since more tagged training data can be obtained by using the first model, the second model can converge effectively during the training process and the performance of the second model is improved. Therefore, the embodiments of this application can improve the reliability of AI-based wireless communication.
Optionally, in some implementations, the above first model is stored in the second device; in this case, before the first device processes the first data using the first model, the method further includes:
the first device receives second information from the second device, where the second information includes the first model.
In the embodiments of this application, that the second information includes the first model can be understood as: the second information includes the parameters of the first model or includes the address information of the first model, so that the first device can obtain the first model.
Further, in some embodiments, the second information further includes at least one of configuration information and first auxiliary information, where the configuration information is used to indicate the usage manner of the first model, the first auxiliary information includes statistical information and environment information required for running the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
Optionally, the above configuration information is used to indicate how to use the first model and may include, for example, the data dimension or input data format of the first model, the output dimension or output data format, the input data amount, the output data amount, and so on. The above environment information can be understood as environment information related to the data augmentation algorithm of the first model; this environment information may include the software environment and hardware environment required for running the model, for example, the required software architecture, hardware architecture, power requirement, storage requirement, computing-power requirement, and so on. The above statistical information may include distribution characteristic information such as the mean and variance of the model input.
Optionally, in some embodiments, before the first device receives the second information from the second device, the method further includes:
the first device sends a first request message to the second device, where the first request message is used to request to obtain the second information.
In the embodiments of this application, when the first device needs to expand the first data, it can obtain the above second information by way of a request, which can improve the pertinence of obtaining the second information. Of course, in other embodiments, the second device may also actively send the second information to the first device; for example, when the second device establishes a connection with the first device, the second device sends the second information to the first device; alternatively, the second device may broadcast the second information, and when the first device needs the second information, it obtains the second information directly from the broadcast information.
Optionally, in some embodiments, after the first device processes the first data using the first model to obtain the second data, the method further includes:
the first device trains the second model based on the second data to obtain a third model.
In the embodiments of this application, the first device may train the second model to obtain the third model, where the third model may be used on the first device or on the second device.
It should be understood that the above second model may be sent by the second device to the first device, or may be preconfigured in the first device by protocol, which is not further limited here.
Optionally, in some embodiments, if the third model is used on the second device, after the first device trains the second model based on the second data to obtain the third model, the method further includes:
the first device sends the third model to the second device.
In the embodiments of this application, sending the third model can be understood as sending the parameters of the third model or sending the address information of the third model, which is not further limited here. In this way, when the trained third model is used to perform inference for the corresponding service, the accuracy of the inference can be improved, thereby ensuring the reliability of communication.
It should be noted that the initial training data of the above second model may include the above first data and may further include tagged third data. When training the second model, if the first data is tagged data, the training data used may include the first data, the second data, and the third data; if the first data is untagged data, the training data used may include the second data and the third data.
Optionally, depending on where the first information is stored, the corresponding manner of obtaining the first information differs. For example, in some embodiments, that the first device obtains the first information includes any one of the following:
the first device receives the first information from the second device;
the first device obtains the first information locally.
In the embodiments of this application, when the first information is stored in the second device, the first device can receive the first information from the second device; when the first information is stored in the first device, the first device can obtain the first information locally.
Optionally, before the first device receives the first information from the second device, the method further includes:
the first device sends indication information to the second device, where the indication information is used to indicate the second device to send the first information.
In the embodiments of this application, the first information is stored in the second device, and the first device needs to indicate the second device to send the first information; for example, the first device may schedule the second device to send the first information.
Optionally, in some embodiments, before the first device receives the first information from the second device, the method further includes:
the first device receives a second request message from the second device, where the second request message is used by the second device to request to send the first information.
In the embodiments of this application, before the second device sends the first information, the second device may request the first device to send the first information; after that, the second device may send the first information on preconfigured resources, or the first device may dynamically schedule the second device to send the first information, for example, indicate the second device to send the first information through the above indication information.
Optionally, in a case that the first device receives the first information from the second device, after the first device processes the first data using the first model to obtain the second data, the method further includes:
the first device sends third information to the second device, where the third information includes the second data.
In the embodiments of this application, the process of training the above second model is performed by the second device; in this case, the first device needs to send the second data to the second device for the second device to train the second model to obtain the third model. The second device's training of the second model is similar to the first device's training of the second model, and the definition of the training data may refer to the above examples, which will not be repeated here.
Further, in some embodiments, the third information further includes identification information, where the identification information is used to indicate that the second data is obtained based on the first model.
Optionally, in some embodiments, the first information further includes second auxiliary information, where the second auxiliary information is used to represent the distribution characteristics of the first data.
In the embodiments of this application, the above second auxiliary information may include information representing distribution characteristics, such as the mean and variance of the first data.
For a better understanding of this application, some examples are described below.
In some embodiments, device A sends the first model to device B, device B performs data augmentation using the received first model and its own first data to obtain the second data, and device B then trains the second model based on the second data. As shown in Figure 4, the process specifically includes the following steps:
Step 401: device B sends a first message to device A, where the first message is used to request the first model, configuration parameters, and first auxiliary information.
Step 402: device A sends the first model, configuration parameters, and first auxiliary information to device B.
Step 403: device B augments the first data based on the first model, configuration parameters, and first auxiliary information to obtain the second data.
Step 404: device B trains the second model based on the second data to obtain a third model.
In some embodiments, device B sends the first data to device A, device A performs data augmentation using the received first data and its own trained first model to obtain the second data, device A then trains the second model based on the second data to obtain a third model, and finally sends the third model to device B. As shown in Figure 5, the process specifically includes the following steps:
Step 501: device B sends a second message to device A, where the second message is used to request to send the first data;
Step 502: device A sends a third message to device B, where the third message is used to indicate to send the first data.
Step 503: device B sends the first data to device A.
Step 504: device A augments the first data based on its own first model, configuration parameters, and first auxiliary information to obtain the second data.
Step 505: device A trains the second model based on the second data to obtain a third model.
Step 506: device A sends the third model to device B.
In some embodiments, device B sends the first data to device A, device A performs data augmentation using the received first data and its own trained first model to obtain the second data, device A then sends the second data to device B, and device B trains the second model using the received second data. As shown in Figure 6, the process specifically includes the following steps:
Step 601: device B sends a second message to device A, where the second message is used to request to send the first data;
Step 602: device A sends a third message to device B, where the third message is used to indicate to send the first data.
Step 603: device B sends the first data to device A.
Step 604: device A augments the first data based on its own first model, configuration parameters, and first auxiliary information to obtain the second data.
Step 605: device A sends the second data to device B.
Step 606: device B trains the second model based on the second data to obtain a third model.
Referring to Figure 7, an embodiment of this application further provides a model training processing method, including:
Step 701: a second device sends second information to a first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on first data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
Optionally, the second information further includes at least one of configuration information and first auxiliary information, where the configuration information is used to indicate the usage manner of the first model, the first auxiliary information includes statistical information and environment information required for running the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
Optionally, before the first device receives the second information from the second device, the method further includes:
the second device receives a first request message from the first device, where the first request message is used to request to obtain the second information.
Referring to Figure 8, an embodiment of this application further provides a model training processing method, including:
Step 801: a second device sends first information to a first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on a first model;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
Optionally, after the second device sends the first information to the first device, the method further includes:
the second device receives a third model from the first device, where the third model is obtained by the first device by training the second model based on the second data.
Optionally, after the second device sends the first information to the first device, the method further includes:
the second device receives third information from the first device, where the third information includes the second data.
Optionally, the third information further includes identification information, where the identification information is used to indicate that the second data is obtained based on the first model.
Optionally, after the second device receives the third information from the first device, the method further includes:
the second device trains the second model based on the second data to obtain a third model.
Optionally, before the second device sends the first information to the first device, the method further includes:
the second device receives indication information from the first device, where the indication information is used to indicate the second device to send the first information.
Optionally, before the second device sends the first information to the first device, the method further includes:
the second device sends a second request message to the first device, where the second request message is used by the second device to request to send the first information.
Optionally, the first information further includes second auxiliary information, where the second auxiliary information is used to represent the distribution characteristics of the first data.
The model training processing method provided by the embodiments of this application may be performed by a model training processing apparatus. In the embodiments of this application, the model training processing apparatus performing the model training processing method is taken as an example to describe the model training processing apparatus provided by the embodiments of this application.
Referring to Figure 9, an embodiment of this application further provides a model training processing apparatus. As shown in Figure 9, the model training processing apparatus 900 includes:
an obtaining module 901, configured to obtain first information, where the first information includes first data;
a processing module 902, configured to process the first data using a first model to obtain second data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
Optionally, the model training processing apparatus 900 further includes:
a first receiving module, configured to receive second information from a second device, where the second information includes the first model.
Optionally, the second information further includes at least one of configuration information and first auxiliary information, where the configuration information is used to indicate the usage manner of the first model, the first auxiliary information includes statistical information and environment information required for running the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
Optionally, the model training processing apparatus 900 further includes:
a first sending module, configured to send a first request message to the second device, where the first request message is used to request to obtain the second information.
Optionally, the model training processing apparatus 900 further includes:
a first training module, configured to train the second model based on the second data to obtain a third model.
Optionally, the model training processing apparatus 900 further includes:
a first sending module, configured to send the third model to the second device.
Optionally, the obtaining module 901 includes any one of the following:
a receiving unit, configured to receive the first information from the second device;
an obtaining unit, configured to obtain the first information locally.
Optionally, the model training processing apparatus 900 further includes:
a first sending module, configured to send indication information to the second device, where the indication information is used to indicate the second device to send the first information.
Optionally, the receiving unit is further configured to: receive a second request message from the second device, where the second request message is used by the second device to request to send the first information.
Optionally, the model training processing apparatus 900 further includes:
a first sending module, configured to send third information to the second device, where the third information includes the second data.
Optionally, the third information further includes identification information, where the identification information is used to indicate that the second data is obtained based on the first model.
Optionally, the first information further includes second auxiliary information, where the second auxiliary information is used to represent the distribution characteristics of the first data.
Referring to Figure 10, an embodiment of this application further provides a model training processing apparatus. As shown in Figure 10, the model training processing apparatus 1000 includes:
a second sending module 1001, configured to send second information to a first device, where the second information includes a first model, and the first model is used by the first device to obtain second data based on first data;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
Optionally, the second information further includes at least one of configuration information and first auxiliary information, where the configuration information is used to indicate the usage manner of the first model, the first auxiliary information includes statistical information and environment information required for running the first model, and the statistical information is used to represent the distribution characteristics of the input of the first model.
Optionally, the model training processing apparatus 1000 further includes:
a second receiving module, configured to receive a first request message from the first device, where the first request message is used to request to obtain the second information.
Referring to Figure 11, an embodiment of this application further provides a model training processing apparatus. As shown in Figure 11, the model training processing apparatus 1100 includes:
a third sending module 1101, configured to send first information to a first device, where the first information includes first data, and the first data is used by the first device to obtain second data based on a first model;
wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
Optionally, the model training processing apparatus 1100 further includes:
a third receiving module, configured to receive a third model from the first device, where the third model is obtained by the first device by training the second model based on the second data.
Optionally, the model training processing apparatus 1100 further includes:
a third receiving module, configured to receive third information from the first device, where the third information includes the second data.
Optionally, the third information further includes identification information, where the identification information is used to indicate that the second data is obtained based on the first model.
Optionally, the model training processing apparatus 1100 further includes:
a second training module, configured to train the second model based on the second data to obtain a third model.
Optionally, the model training processing apparatus 1100 further includes:
a third receiving module, configured to receive indication information from the first device, where the indication information is used to indicate the second device to send the first information.
Optionally, the third sending module 1101 is further configured to send a second request message to the first device, where the second request message is used by the second device to request to send the first information.
Optionally, the first information further includes second auxiliary information, where the second auxiliary information is used to represent the distribution characteristics of the first data.
The model training processing apparatus in the embodiments of this application may be an electronic device, for example, an electronic device with an operating system, or may be a component in an electronic device, for example, an integrated circuit or a chip. The electronic device may be a terminal or a device other than a terminal. Exemplarily, the terminal may include but is not limited to the types of terminal 11 listed above, and the other device may be a server, network attached storage (NAS), etc., which is not specifically limited in the embodiments of this application.
The model training processing apparatus provided by the embodiments of this application can implement each process implemented by the method embodiments of Figures 3 to 8 and achieve the same technical effect; to avoid repetition, details are not repeated here.
Optionally, as shown in Figure 12, an embodiment of this application further provides a communication device 1200, including a processor 1201 and a memory 1202, where the memory 1202 stores a program or instructions executable on the processor 1201. When the program or instructions are executed by the processor 1201, each step of the above model training processing method embodiments is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
An embodiment of this application further provides a terminal, including a processor and a communication interface. When the terminal is the first device, the processor is configured to obtain first information, where the first information includes first data, and to process the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data;
or, when the terminal is the second device, the communication interface is configured to send second information to the first device, where the second information includes the first model, and the first model is used by the first device to obtain the second data based on the first data; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
Or, when the terminal is the second device, the communication interface is configured to send first information to the first device, where the first information includes the first data, and the first data is used by the first device to obtain the second data based on the first model; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
This terminal embodiment corresponds to the above terminal-side method embodiments. Each implementation process and implementation manner of the above method embodiments is applicable to this terminal embodiment and can achieve the same technical effect. Specifically, Figure 13 is a schematic diagram of the hardware structure of a terminal that implements the embodiments of this application.
The terminal 1300 includes but is not limited to at least some of the following components: a radio frequency unit 1301, a network module 1302, an audio output unit 1303, an input unit 1304, a sensor 1305, a display unit 1306, a user input unit 1307, an interface unit 1308, a memory 1309, and a processor 1310.
A person skilled in the art can understand that the terminal 1300 may further include a power supply (such as a battery) that supplies power to each component. The power supply may be logically connected to the processor 1310 through a power management system, so that functions such as managing charging, discharging, and power consumption are implemented through the power management system. The terminal structure shown in Figure 13 does not constitute a limitation on the terminal; the terminal may include more or fewer components than shown, or combine certain components, or have a different component arrangement, which will not be repeated here.
It should be understood that in the embodiments of this application, the input unit 1304 may include a graphics processing unit (GPU) 13041 and a microphone 13042; the graphics processor 13041 processes image data of a static picture or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 1306 may include a display panel 13061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 1307 includes at least one of a touch panel 13071 and other input devices 13072. The touch panel 13071 is also called a touch screen. The touch panel 13071 may include two parts: a touch detection device and a touch controller. The other input devices 13072 may include but are not limited to a physical keyboard, function keys (such as a volume control key, a switch key, etc.), a trackball, a mouse, and a joystick, which will not be repeated here.
In the embodiments of this application, after receiving downlink data from a network-side device, the radio frequency unit 1301 can transmit it to the processor 1310 for processing; in addition, the radio frequency unit 1301 can send uplink data to the network-side device. Generally, the radio frequency unit 1301 includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
The memory 1309 may be used to store software programs or instructions and various data. The memory 1309 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system, an application program or instructions required by at least one function (such as a sound playback function, an image playback function, etc.), and the like. In addition, the memory 1309 may include a volatile memory or a non-volatile memory, or the memory 1309 may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDRSDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synch link dynamic random access memory (SLDRAM), or a direct rambus random access memory (DRRAM). The memory 1309 in the embodiments of this application includes but is not limited to these and any other suitable types of memory.
The processor 1310 may include one or more processing units; optionally, the processor 1310 integrates an application processor and a modem processor, where the application processor mainly processes operations related to the operating system, user interface, application programs, etc., and the modem processor mainly processes wireless communication signals, such as a baseband processor. It can be understood that the above modem processor may also not be integrated into the processor 1310.
When the terminal is the first device, the processor 1310 is configured to obtain first information, where the first information includes first data, and to process the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data;
or, when the terminal is the second device, the radio frequency unit 1301 is configured to send second information to the first device, where the second information includes the first model, and the first model is used by the first device to obtain the second data based on the first data; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
Or, when the terminal is the second device, the radio frequency unit 1301 is configured to send first information to the first device, where the first information includes the first data, and the first data is used by the first device to obtain the second data based on the first model; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
An embodiment of this application further provides a network-side device, including a processor and a communication interface. When the network-side device is the first device, the processor is configured to obtain first information, where the first information includes first data, and to process the first data using a first model to obtain second data; wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data;
or, when the network-side device is the second device, the communication interface is configured to send second information to the first device, where the second information includes the first model, and the first model is used by the first device to obtain the second data based on the first data; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
Or, when the network-side device is the second device, the communication interface is configured to send first information to the first device, where the first information includes the first data, and the first data is used by the first device to obtain the second data based on the first model; wherein both the first data and the second data can be used to train the second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, the data amount of the second data is greater than the data amount of the first data.
This network-side device embodiment corresponds to the above network-side device method embodiments. Each implementation process and implementation manner of the above method embodiments is applicable to this network-side device embodiment and can achieve the same technical effect.
Specifically, an embodiment of this application further provides a network-side device. As shown in Figure 14, the network-side device 1400 includes: an antenna 1401, a radio frequency device 1402, a baseband device 1403, a processor 1404, and a memory 1405. The antenna 1401 is connected to the radio frequency device 1402. In the uplink direction, the radio frequency device 1402 receives information through the antenna 1401 and sends the received information to the baseband device 1403 for processing. In the downlink direction, the baseband device 1403 processes the information to be sent and sends it to the radio frequency device 1402, and the radio frequency device 1402 processes the received information and then sends it out through the antenna 1401.
The method performed by the network-side device in the above embodiments may be implemented in the baseband device 1403, which includes a baseband processor.
The baseband device 1403 may include, for example, at least one baseband board provided with multiple chips. As shown in Figure 14, one of the chips is, for example, a baseband processor, which is connected to the memory 1405 through a bus interface to call the program in the memory 1405 and perform the network device operations shown in the above method embodiments.
The network-side device may further include a network interface 1406, which is, for example, a common public radio interface (CPRI).
Specifically, the network-side device 1400 of this embodiment of the present invention further includes: instructions or a program stored in the memory 1405 and executable on the processor 1404. The processor 1404 calls the instructions or program in the memory 1405 to execute the methods executed by the modules shown in Figures 9 to 11 and achieves the same technical effect; to avoid repetition, details are not repeated here.
An embodiment of this application further provides a readable storage medium, where a program or instructions are stored on the readable storage medium. When the program or instructions are executed by a processor, each process of the above model training processing method embodiments is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
The processor is the processor in the terminal described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
An embodiment of this application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run a program or instructions to implement each process of the above model training processing method embodiments and can achieve the same technical effect; to avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, a system-on-chip, or the like.
An embodiment of this application further provides a computer program/program product. The computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement each process of the above model training processing method embodiments and can achieve the same technical effect; to avoid repetition, details are not repeated here.
It should be noted that, in this document, the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus including a series of elements includes not only those elements but also other elements not expressly listed, or also includes elements inherent to such a process, method, article, or apparatus. Without more restrictions, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article, or apparatus including that element. In addition, it should be pointed out that the scope of the methods and apparatuses in the implementations of this application is not limited to performing the functions in the order shown or discussed, and may also include performing the functions in a substantially simultaneous manner or in a reverse order according to the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the description of the above implementations, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform; of course, they can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or the part contributing to the existing technology, may be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the various embodiments of this application.
The embodiments of this application are described above with reference to the accompanying drawings, but this application is not limited to the above specific implementations. The above specific implementations are merely illustrative rather than restrictive. Inspired by this application, a person of ordinary skill in the art can also make many forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.

Claims (29)

  1. A model training processing method, comprising:
    obtaining, by a first device, first information, wherein the first information comprises first data;
    processing, by the first device, the first data using a first model to obtain second data;
    wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
  2. The method according to claim 1, wherein before the first device processes the first data using the first model, the method further comprises:
    receiving, by the first device, second information from a second device, wherein the second information comprises the first model.
  3. The method according to claim 2, wherein the second information further comprises at least one of configuration information and first auxiliary information, wherein the configuration information is used to indicate a usage manner of the first model, the first auxiliary information comprises statistical information and environment information required for running the first model, and the statistical information is used to represent distribution characteristics of an input of the first model.
  4. The method according to claim 2, wherein before the first device receives the second information from the second device, the method further comprises:
    sending, by the first device, a first request message to the second device, wherein the first request message is used to request to obtain the second information.
  5. The method according to claim 1, wherein after the first device processes the first data using the first model to obtain the second data, the method further comprises:
    training, by the first device, the second model based on the second data to obtain a third model.
  6. The method according to claim 5, wherein after the first device trains the second model based on the second data to obtain the third model, the method further comprises:
    sending, by the first device, the third model to a second device.
  7. The method according to claim 1, wherein the obtaining, by the first device, first information comprises any one of the following:
    receiving, by the first device, the first information from a second device;
    obtaining, by the first device, the first information locally.
  8. The method according to claim 7, wherein before the first device receives the first information from the second device, the method further comprises:
    sending, by the first device, indication information to the second device, wherein the indication information is used to indicate the second device to send the first information.
  9. The method according to claim 7, wherein before the first device receives the first information from the second device, the method further comprises:
    receiving, by the first device, a second request message from the second device, wherein the second request message is used by the second device to request to send the first information.
  10. The method according to claim 7, wherein in a case that the first device receives the first information from the second device, after the first device processes the first data using the first model to obtain the second data, the method further comprises:
    sending, by the first device, third information to the second device, wherein the third information comprises the second data.
  11. The method according to claim 10, wherein the third information further comprises identification information, and the identification information is used to indicate that the second data is obtained based on the first model.
  12. The method according to claim 1, wherein the first information further comprises second auxiliary information, and the second auxiliary information is used to represent distribution characteristics of the first data.
  13. A model training processing method, comprising:
    sending, by a second device, second information to a first device, wherein the second information comprises a first model, and the first model is used by the first device to obtain second data based on first data;
    wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
  14. The method according to claim 13, wherein the second information further comprises at least one of configuration information and first auxiliary information, wherein the configuration information is used to indicate a usage manner of the first model, the first auxiliary information comprises statistical information and environment information required for running the first model, and the statistical information is used to represent distribution characteristics of an input of the first model.
  15. The method according to claim 13, wherein before the first device receives the second information from the second device, the method further comprises:
    receiving, by the second device, a first request message from the first device, wherein the first request message is used to request to obtain the second information.
  16. A model training processing method, comprising:
    sending, by a second device, first information to a first device, wherein the first information comprises first data, and the first data is used by the first device to obtain second data based on a first model;
    wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
  17. The method according to claim 16, wherein after the second device sends the first information to the first device, the method further comprises:
    receiving, by the second device, a third model from the first device, wherein the third model is obtained by the first device by training the second model based on the second data.
  18. The method according to claim 16, wherein after the second device sends the first information to the first device, the method further comprises:
    receiving, by the second device, third information from the first device, wherein the third information comprises the second data.
  19. The method according to claim 18, wherein the third information further comprises identification information, and the identification information is used to indicate that the second data is obtained based on the first model.
  20. The method according to claim 18, wherein after the second device receives the third information from the first device, the method further comprises:
    training, by the second device, the second model based on the second data to obtain a third model.
  21. The method according to claim 16, wherein before the second device sends the first information to the first device, the method further comprises:
    receiving, by the second device, indication information from the first device, wherein the indication information is used to indicate the second device to send the first information.
  22. The method according to claim 16, wherein before the second device sends the first information to the first device, the method further comprises:
    sending, by the second device, a second request message to the first device, wherein the second request message is used by the second device to request to send the first information.
  23. The method according to claim 16, wherein the first information further comprises second auxiliary information, and the second auxiliary information is used to represent distribution characteristics of the first data.
  24. A model training processing apparatus, comprising:
    an obtaining module, configured to obtain first information, wherein the first information comprises first data;
    a processing module, configured to process the first data using a first model to obtain second data;
    wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
  25. A model training processing apparatus, comprising:
    a second sending module, configured to send second information to a first device, wherein the second information comprises a first model, and the first model is used by the first device to obtain second data based on first data;
    wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
  26. A model training processing apparatus, comprising:
    a third sending module, configured to send first information to a first device, wherein the first information comprises first data, and the first data is used by the first device to obtain second data based on a first model;
    wherein both the first data and the second data can be used to train a second model, the second model is a business model, and the second data satisfies at least one of the following: in a case that the first data is untagged data, the second data is tagged data; in a case that the first data is tagged data, a data amount of the second data is greater than a data amount of the first data.
  27. A terminal, comprising a processor and a memory, wherein the memory stores a program or instructions executable on the processor, and when the program or instructions are executed by the processor, the steps of the model training processing method according to any one of claims 1 to 23 are implemented.
  28. A network-side device, comprising a processor and a memory, wherein the memory stores a program or instructions executable on the processor, and when the program or instructions are executed by the processor, the steps of the model training processing method according to any one of claims 1 to 23 are implemented.
  29. A readable storage medium, wherein a program or instructions are stored on the readable storage medium, and when the program or instructions are executed by a processor, the steps of the model training processing method according to any one of claims 1 to 23 are implemented.
PCT/CN2023/092028 2022-05-06 2023-05-04 Model training processing method and apparatus, terminal, and network-side device Ceased WO2023213270A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP23799256.5A EP4521309A4 (en) 2022-05-06 2023-05-04 Model training processing method, device, terminal, and network-side device
US18/935,694 US20250061381A1 (en) 2022-05-06 2024-11-04 Model training processing method and apparatus, terminal, and network side device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210489247.9A 2022-05-06 2022-05-06 Model training processing method and apparatus, terminal, and network-side device
CN202210489247.9 2022-05-06

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/935,694 Continuation US20250061381A1 (en) 2022-05-06 2024-11-04 Model training processing method and apparatus, terminal, and network side device

Publications (1)

Publication Number Publication Date
WO2023213270A1 true WO2023213270A1 (zh) 2023-11-09

Family

ID=88646297

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/092028 Ceased WO2023213270A1 (zh) 2022-05-06 2023-05-04 模型训练处理方法、装置、终端及网络侧设备

Country Status (4)

Country Link
US (1) US20250061381A1 (zh)
EP (1) EP4521309A4 (zh)
CN (1) CN117093858A (zh)
WO (1) WO2023213270A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460156A (zh) * 2020-03-31 2020-07-28 深圳前海微众银行股份有限公司 Sample expansion method, apparatus, device and computer-readable storage medium
US20200342329A1 (en) * 2019-04-25 2020-10-29 Sap Se Architecture search without using labels for deep autoencoders employed for anomaly detection
CN111860424A (zh) * 2020-07-30 2020-10-30 厦门熵基科技有限公司 Training method and apparatus for a visible-light palm recognition model
CN113965993A (zh) * 2020-07-21 2022-01-21 维沃移动通信有限公司 Direct communication start control method and related device
CN114363921A (zh) * 2020-10-13 2022-04-15 维沃移动通信有限公司 AI network parameter configuration method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107832305A (zh) * 2017-11-28 2018-03-23 百度在线网络技术（北京）有限公司 Method and apparatus for generating information
WO2020219971A1 (en) * 2019-04-25 2020-10-29 Google Llc Training machine learning models using unsupervised data augmentation
CN112702751A (zh) * 2019-10-23 2021-04-23 中国移动通信有限公司研究院 Training and upgrading method of a wireless communication model, network device and storage medium
CN111161740A (zh) * 2019-12-31 2020-05-15 中国建设银行股份有限公司 Intention recognition model training method, intention recognition method and related apparatus
WO2021163895A1 (zh) * 2020-02-18 2021-08-26 Oppo广东移动通信有限公司 Network model management method and method and apparatus for establishing or modifying a session
CN111460150B (zh) * 2020-03-27 2023-11-10 北京小米松果电子有限公司 Classification model training method, classification method, apparatus and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4521309A4

Also Published As

Publication number Publication date
CN117093858A (zh) 2023-11-21
US20250061381A1 (en) 2025-02-20
EP4521309A4 (en) 2025-07-30
EP4521309A1 (en) 2025-03-12

Similar Documents

Publication Publication Date Title
WO2023143572A1 (zh) Positioning method based on an artificial intelligence AI model, and communication device
US20240348511A1 (en) Data processing method and apparatus, and communication device
CN116234001A (zh) Positioning method and communication device
WO2023125747A1 (zh) Model training method and apparatus, and communication device
CN113645637B (zh) Ultra-dense network task offloading method and apparatus, computer device, and storage medium
WO2023125594A1 (zh) AI model transmission method, apparatus, device, and storage medium
CN117896714A (zh) Model selection method, terminal, and network-side device
WO2023098534A1 (zh) AI model switching processing method and apparatus, and communication device
WO2024088119A1 (zh) Data processing method and apparatus, terminal, and network-side device
WO2024120445A1 (zh) Method, apparatus, device, system, and storage medium for determining model input information
WO2024067281A1 (zh) AI model processing method and apparatus, and communication device
US20250299109A1 (en) Information transmission method, information transmission apparatus, and communication device
WO2024032694A1 (zh) CSI prediction processing method and apparatus, communication device, and readable storage medium
US20250148297A1 (en) Federated learning method and apparatus, communication device, and readable storage medium
WO2023213270A1 (zh) Model training processing method and apparatus, terminal, and network-side device
CN118504646A (zh) Model supervision method and apparatus, and communication device
WO2023179653A1 (zh) Beam processing method, apparatus, and device
WO2024083004A1 (zh) AI model configuration method, terminal, and network-side device
WO2023155839A1 (zh) AI model online learning method and apparatus, communication device, and readable storage medium
WO2024061287A1 (zh) Artificial intelligence AI model transmission method and apparatus, terminal, and medium
US20250343740A1 (en) Information transmission method and apparatus, and device
CN118487952A (зh) AI model indication method and communication device
CN118283643A (zh) Dataset determination method, information transmission method, apparatus, and communication device
WO2023098585A1 (zh) Information transmission method and apparatus, and communication device
WO2024235043A1 (зh) Information acquisition method and apparatus, and communication device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23799256

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023799256

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2023799256

Country of ref document: EP

Effective date: 20241206