
CN111224905B - A Multi-User Detection Method Based on Convolutional Residual Networks in Large-scale Internet of Things - Google Patents

A Multi-User Detection Method Based on Convolutional Residual Networks in Large-scale Internet of Things Download PDF

Info

Publication number
CN111224905B
CN111224905B · Application CN201911358559.0A
Authority
CN
China
Prior art keywords
network
users
user
channel
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201911358559.0A
Other languages
Chinese (zh)
Other versions
CN111224905A (en
Inventor
李国兵
吴萧江
张国梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN201911358559.0A priority Critical patent/CN111224905B/en
Publication of CN111224905A publication Critical patent/CN111224905A/en
Application granted granted Critical
Publication of CN111224905B publication Critical patent/CN111224905B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00Baseband systems
    • H04L25/02Details ; arrangements for supplying electrical power along data transmission lines
    • H04L25/0202Channel estimation
    • H04L25/024Channel estimation channel estimation algorithms
    • H04L25/0254Channel estimation channel estimation algorithms using neural network algorithms
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00Baseband systems
    • H04L25/02Details ; arrangements for supplying electrical power along data transmission lines
    • H04L25/0202Channel estimation
    • H04L25/0224Channel estimation using sounding signals


Abstract



The invention discloses a multi-user detection method based on a convolutional residual network in a large-scale Internet of Things. It is assumed that the reception of all frames is synchronous, and each user uses a different pilot sequence for channel estimation. The pilot sequence matrices, channel vectors and user activities of all users are placed in the same matrices and optimized to obtain the signal received by the base station. A centralized-layers convolutional neural network (CLCNN) is designed to detect the active users with a deep learning method: the received signal is fed into the deep network, which estimates the activation vector. Data are collected or generated and divided into a training set and a test set; the received signal is input into the network, which is trained with a stochastic gradient descent optimizer while the loss function and detection accuracy are computed. The channel estimation of large-scale access is thereby reduced to channel estimation for a small number of active users, and the minimum mean square error method is used to estimate the channel and complete the multi-user detection. The invention features high performance, fast training and detection, and full implementability.


Description

Multi-user detection method based on a convolutional residual network in the large-scale Internet of Things
Technical Field
The invention belongs to the technical field of wireless communication, and particularly relates to a multi-user detection method based on a convolutional residual network in a large-scale Internet of Things.
Background
The Internet of Things (IoT) is regarded as a major development opportunity in the information field and is expected to bring revolutionary change. IoT also differs significantly from traditional communications: a defining feature of the large-scale Internet of Things is that a large number of users sporadically transmit very small packets, unlike conventional cellular communications.
User detection in the large-scale Internet of Things is a challenging problem. The access of a massive number of users makes the complexity of traditional user detection schemes too high to be usable, and exploiting the sparsity of the large-scale Internet of Things is the key issue. The current solutions built on sparsity are mainly compressed sensing (CS) and deep learning (DL). Although compressed sensing can reduce complexity by exploiting sparsity, there is still much room for improvement in both performance and complexity.
Deep learning relies on deep neural networks: nonlinear deep network structures containing a large number of neurons that can approximate complex functions and thereby learn the essential characteristics of a data set from samples. Deep learning has achieved great success, but is accompanied by a number of challenges. In terms of application it is mainly focused on fields such as images and natural language processing, where the popular convolutional neural network has had its greatest success. Its application to other fields keeps expanding, but broad adoption and domain-specific networks are still limited. Applying deep learning to the communication field is a promising line of research, and studying network structures suited to communications can further promote the development of the communication industry in a new era.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide, in view of the defects of the prior art, a multi-user detection method based on a convolutional residual network in a large-scale Internet of Things that achieves better performance in multi-user detection and channel estimation and converges faster during network training.
The invention adopts the following technical scheme:
a multi-user detection method based on a convolution residual error network in a large-scale Internet of things comprises the following steps:
s1, assuming that the reception of all frames is synchronous, each user uses a different pilot sequence SkMake channel estimation, K1.. K, matrix s of pilot sequences of all userskChannel vector hkAnd user activity akRespectively putting the signals in the same matrix for optimization to obtain a signal y received by the base station;
s2, designing a concentrated layer convolutional neural network CLCNN, detecting an activated user by using a deep learning method, and inputting a received signal y into the deep learning network to estimate an activation vector a;
and S3, collecting or generating data, dividing it into a training set and a test set, inputting the received signal y into the network, performing optimization training with a stochastic gradient descent optimizer, calculating the loss function and detection accuracy, reducing the channel estimation of large-scale access to channel estimation for a small number of active users, estimating the channel with the minimum mean square error method, and completing the multi-user detection.
Specifically, in step S1, the signal y received by the base station is:
y=SAh+n=Sx+n
where S is the stacked pilot sequence matrix, A is the diagonal activity matrix, and x = Ah is a block-sparse vector with n nonzero blocks corresponding to the active users; n is additive white Gaussian noise.
Further, x is reconstructed from y, which simultaneously detects the active users and estimates their channels; the optimization problem is as follows:
min_x ||y - Sx||_2^2   s.t.   ||x||_0 ≤ n
where ||·||_0 denotes the 0-norm, counted over the L-dimensional blocks of x, L is the dimension of the channel vector, and n is the maximum number of active users.
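For illustration, the block 0-norm in the constraint can be computed with a short sketch (a hypothetical helper of ours, not part of the patent): it counts the nonzero L-dimensional blocks of x, which in this model equals the number of active users.

```python
import numpy as np

def block_l0_norm(x, L):
    """Count the nonzero L-dimensional blocks of x (the block 0-norm)."""
    blocks = x.reshape(-1, L)                      # one row per user
    return int(np.sum(np.linalg.norm(blocks, axis=1) > 0))

# Example: K = 4 users, L = 3 channel taps, users 0 and 2 active.
K, L = 4, 3
x = np.zeros(K * L)
x[0:3] = [0.5, -1.2, 0.3]      # user 0's channel block
x[6:9] = [1.1, 0.0, -0.7]      # user 2's channel block
print(block_l0_norm(x, L))     # → 2
```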
Specifically, in step S2, the CLCNN network structure uses convolution layers as the computation layers; the convolutions are one-dimensional, the convolution kernel size of the first layer is 7 and of the remaining layers 3, and the number of channels increases stepwise from 64 to 1028; the stride of the pooling layers is 2; the fully connected layer has 1000 neurons; the solid identity-connection lines are direct additions, the dashed lines indicate a dimension change, and the dimension transformation is implemented with convolution layers of kernel size 1.
Further, the regularized objective function of the CLCNN network is as follows:
J_final(θ) = J_loss(θ) + αΩ(θ)
where θ is the set of all parameters, α is the regularization coefficient, Ω(θ) is the parameter norm penalty, and J_loss(θ) is the objective function.
Further, the loss function of the CLCNN network is:
J_final(θ) = (a_output - a_actual)^2 + α Σ_i W_i^2
where a_output is the output of the network, whose elements are decimals, a_actual is the actual activation vector of the users, and W_i are the weight parameters of the network.
Specifically, step S3 specifically includes:
s301, collecting or generating dataThe set contains a pilot sequence matrix SallA received signal y of a large number of samples and an active user vector a; dividing a data set into a training set and a testing set, wherein the ratio is 10: 1, the training set is used for training a network, and the testing set is used for checking whether the network is trained successfully;
s302, in the network training process, inputting the received signal y into the network, and calculating the output aoutputWith the activation vector a of the data set samplesactualI.e. the loss function J (θ) ═ aoutput-aactual)2Then, performing optimization training by using a random gradient descent optimizer;
s303, in the testing process, inputting the test sample into a network, and calculating a loss function and a detection accuracy;
s304, after the activation vector a is estimated, simplifying the channel estimation of large-scale access into the channel estimation of a small number of activation users, and estimating the channel by adopting a minimum mean square error method.
Further, in step S301, the number of active users among the total K users is n, the value of n and the positions of the active users being random; the vector a = {a_k} describing user activation is a K-dimensional vector in which n random entries are 1 and the rest are 0; the signal transmitted by an active user is a pilot sequence of length N_s modulated with binary phase shift keying; the pilot sequence length of each user is fixed to N_s, and the channel model is a Gaussian channel with L independent identically distributed taps; the pilot sequences are randomly generated, and with K > (N_s + L - 1)/L the multi-user detection problem is underdetermined.
Further, in step S302, the training data are generated with the number of active users n set to its upper limit.
Specifically, the multi-user detection method involves a base station and a plurality of users; the access mode is grant-free, the users send signals directly to the base station without first requesting access, and in a given frame only a small number n of the K users transmit.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention relates to a multi-user detection method based on a convolution residual error network in a large-scale Internet of things, which redesigns a standard convolution residual error network, adopts a convolution computing layer identity connection structure, and redesigns a network structure at the same time, so that the network in the invention exceeds a full-connection network, the standard residual error network and a compressed sensing algorithm in performance and achieves the optimum in all schemes of research.
Furthermore, the CLCNN structure adopts convolutional layers, whose weight sharing gives it performance far exceeding that of a fully connected network with far fewer parameters. Meanwhile, owing to the redesigned structure, its convergence speed far exceeds that of a standard residual network, so the method can be applied well in this scenario.
Furthermore, by setting a reasonable regularization coefficient α, regularization reduces overfitting, improves the test accuracy of the network, and gives the network better accuracy in use.
Furthermore, the Internet of Things scenario adopts direct one-way grant-free transmission, which is simpler than grant-based methods that require base station management. In the detection process the network first detects the users, reducing the large-scale access problem to a channel estimation problem for a small number of users; compared with having the network estimate the channel directly, the required training data are the users' activation vectors rather than channel vectors, which makes the method fully realizable. Some published schemes use channel-vector data, which is impractical because the channel vectors are unknown.
In conclusion, the detection method has the characteristics of high performance, high training and detection speed and complete feasibility.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a diagram of a system model of the present invention;
FIG. 2 is a block diagram of a deep network according to the present invention;
FIG. 3 is an overall block diagram of the deep network of the present invention;
FIG. 4 is a graph of the loss function of the inventive scheme during training as a function of the number of training iterations;
FIG. 5 is a comparison graph of user detection accuracy varying with the number of active users in different schemes;
fig. 6 is a comparison graph of channel accuracy rate as a function of the number of active users for different schemes.
Detailed Description
Referring to fig. 1, the multi-user detection method based on a convolutional residual network in a large-scale Internet of Things involves a base station and a plurality of users; the access mode is grant-free, the users send signals directly to the base station without first requesting access, and in a given frame only a small number n of the K users transmit. The method comprises the following steps:
s1, assuming that the reception of all frames is synchronous, each user uses a different pilot sequence SkAnd (K is 1.. K.) for channel estimation, and the channel vector of the user K is recorded as hk∈CLWhere C represents a set of complex numbers. Pilot sequence skAnd channel hkThe convolution y SAh + n Sx + n may be written as a matrix skAnd vector hkProduct of (a), SkMatrix is composed ofkSequence shift constructs, as follows:
S_k = [ s_k(1)      0         ...      0
        s_k(2)    s_k(1)      ...      0
          ...       ...                ...
        s_k(N_s)  s_k(N_s-1)  ...
          0       s_k(N_s)    ...
          ...                 ...   s_k(N_s-1)
          0         0         ...   s_k(N_s)  ]  ∈ C^((N_s+L-1)×L)    (1)
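As a sanity check on this shift construction (our own illustrative sketch, not part of the patent), the following builds S_k from shifted copies of s_k and verifies that S_k h_k reproduces the linear convolution of s_k with h_k:

```python
import numpy as np

def pilot_matrix(s, L):
    """Build the (Ns+L-1) x L convolution (Toeplitz) matrix from pilot s."""
    Ns = len(s)
    Sk = np.zeros((Ns + L - 1, L))
    for col in range(L):
        Sk[col:col + Ns, col] = s     # each column is a shifted copy of s
    return Sk

rng = np.random.default_rng(0)
Ns, L = 8, 3
s = rng.choice([-1.0, 1.0], size=Ns)  # BPSK pilot (illustrative length)
h = rng.normal(size=L)                # L channel taps
Sk = pilot_matrix(s, L)
# The matrix-vector product equals the full convolution s * h.
print(np.allclose(Sk @ h, np.convolve(s, h)))  # → True
```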
signals received by the base station
Figure GDA0002401533460000052
where a_k ∈ {0,1} indicates whether user k is active: a_k = 1 means the user transmitted a signal in this frame, a_k = 0 denotes user silence; n is additive white Gaussian noise.
The pilot sequence matrices S_k, channel vectors h_k and user activities a_k of all users are then respectively stacked:

S = [S_1, S_2, ..., S_K] ∈ C^((N_s+L-1)×KL), h = [h_1^T, h_2^T, ..., h_K^T]^T ∈ C^(KL),

A = diag(a_1 I, ..., a_K I) ∈ Z^(KL×KL), where diag(·) denotes forming the block-diagonal matrix.
Formula (2) may then be written as y = SAh + n = Sx + n (3)
where x = Ah is a block-sparse vector with n nonzero blocks corresponding to the active users; x is reconstructed from y, which simultaneously detects the active users and estimates their channels. Thus, the problem reduces to solving the following optimization problem:
min_x ||y - Sx||_2^2   s.t.   ||x||_0 ≤ n    (4)

where ||·||_0 denotes the 0-norm, i.e. the number of non-zero elements, counted here over the L-dimensional blocks of x.
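The construction above (stacked pilot matrix S, block-diagonal activity, block-sparse x = Ah) can be sketched numerically as follows; the sizes, seed and noise level are illustrative choices of ours, not the patent's:

```python
import numpy as np

rng = np.random.default_rng(1)
K, Ns, L, n_active = 10, 8, 3, 2

def pilot_matrix(s, L):
    """(Ns+L-1) x L Toeplitz matrix built from shifts of pilot s."""
    Sk = np.zeros((len(s) + L - 1, L))
    for col in range(L):
        Sk[col:col + len(s), col] = s
    return Sk

# Stack per-user pilot matrices: S = [S_1, ..., S_K].
pilots = [rng.choice([-1.0, 1.0], size=Ns) for _ in range(K)]
S = np.hstack([pilot_matrix(s, L) for s in pilots])        # (Ns+L-1) x KL

# Random activity vector a with n_active ones; i.i.d. Gaussian channels.
a = np.zeros(K)
a[rng.choice(K, n_active, replace=False)] = 1
h = rng.normal(size=K * L)
x = np.repeat(a, L) * h                                    # x = A h, block sparse
noise = 0.01 * rng.normal(size=Ns + L - 1)
y = S @ x + noise                                          # received signal

nonzero_blocks = int(np.sum(np.linalg.norm(x.reshape(K, L), axis=1) > 0))
print(nonzero_blocks)   # → 2
```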
S2, designing a concentrated layer convolutional neural network CLCNN, detecting an activated user by using a deep learning method, and inputting a received signal y into the deep learning network to estimate an activation vector a;
the deep Neural Network uses a Centralized Layers Convolutional Neural Network (CLCNN) disclosed by the present invention, and the CLCNN Network and the overall structure are shown in fig. 2 and fig. 3. The standard ResNet network has good performance in image classification, for example, as shown in the figure, the ResNet network has poor performance when being directly used due to certain difference between the scene of the Internet of things and the image classification. The new network CLCNN disclosed by the invention redesigns the residual network block and the whole network structure according to a certain criterion. The designed network has faster training speed and better performance.
Referring to fig. 2, the network block diagram of the deep network according to the present invention is specifically described as follows:
the design criteria of CLCNN make the network layer more centralized
First, for a neuron, the weight computation (here a convolution layer) and the activation function belong together as the two parts of one neuron cell, and a complete neuron cell structure favors information transfer. Inserting the BatchNorm normalization inside a neuron cell splits the neuron in two, which neither matches biological characteristics nor changes the information inside the neuron, and to some extent hinders information transfer. The invention therefore removes the BatchNorm and the identity connection from between the convolution layer and the activation function, ensuring the integrity of the neuron cell.
Second, gathering neurons of the same type together and putting additional layers such as BatchNorm at the end facilitates information transfer, much like biological tissue is composed of cells of the same kind. The second criterion of the CLCNN is therefore to group neurons of the same type as much as possible; classical networks such as LeNet, which always place all convolution layers first and fully connected layers last, support this criterion indirectly.
The third criterion of CLCNN is that the identity connection of each network block should enclose as many of the block's layers as possible (including the activation function), with no computation layers between different network blocks. Since its use in AlexNet, the ReLU activation function has been favored by deep learning practitioners because it greatly reduces gradient vanishing compared with the Sigmoid function: its positive part has gradient 1 and so produces neither vanishing nor exploding gradients, but its negative part has gradient 0 and can still cause gradient vanishing. The invention moves the ReLU function, which in ResNet lies outside the identity connection, to the inside of the identity connection, reducing the gradient vanishing problem and making network training faster. The improved residual network block is shown in fig. 2.
The convolution kernel size in the figure is 3. BatchNorm normalizes the data. ReLU is the activation function y = max(x, 0).
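A minimal numerical sketch of this modified residual block (our illustration with single-channel one-dimensional convolutions and no BatchNorm; a real implementation would use a deep learning framework and trained multi-channel kernels): the weight layers and ReLU activations stay together inside the identity connection, and the skip is a plain addition.

```python
import numpy as np

def conv1d(x, w):
    """'Same'-padded 1-D convolution for a single channel."""
    return np.convolve(x, w, mode="same")

def relu(x):
    return np.maximum(x, 0.0)

def clcnn_block(x, w1, w2):
    """Residual block: conv -> ReLU -> conv -> ReLU, all inside the identity."""
    out = relu(conv1d(x, w1))
    out = relu(conv1d(out, w2))
    return x + out        # the identity connection wraps the whole neuron group

rng = np.random.default_rng(2)
x = rng.normal(size=16)
w1, w2 = rng.normal(size=3), rng.normal(size=3)  # kernel size 3, as in the figure
y = clcnn_block(x, w1, w2)
print(y.shape)   # → (16,)
```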
Referring to fig. 3, the overall structure diagram of the deep network according to the present invention is described as follows:
the CLCNN network structure adopts the convolution layer as a computing layer, which is much better than BRNN adopting a full connection layer. A wider network may achieve better performance if there is enough data, so starting with the first network segment, the number of convolution kernels is doubled to improve the fit ability, while reducing the feature map size, so that more features are extracted and a feature is more characterized for better performance. The redesigned network keeps the same number of layers in different network segments to facilitate design and does not require much time to adjust the number of network layers.
The convolution layers use one-dimensional convolution; the convolution kernel size of the first layer is 7 and of the remaining layers 3, and the number of channels (the number of convolution kernels) is doubled section by section from 64 to 1028. The pooling layers downsample with a stride of 2. The fully connected layer has 1000 neurons. The solid identity-connection lines are direct additions and the dashed lines indicate a dimension change, where the dimension transformation is implemented with convolution layers of kernel size 1.
Regularization
In deep learning detection, overfitting often occurs. The errors of the learner on the training set and the test set are referred to as the training error and the generalization error, respectively. Overfitting means fitting the training samples too closely: the training error is small but the generalization error is large. Underfitting means the learner's fitting ability is insufficient, so both the training error and the generalization error are large.
Overfitting arises when the amount of data is too small or the neural network is too large, so the remedies are to increase the data or shrink the network. Regularization adds a parameter norm penalty Ω(θ) to the objective function J_loss(θ) to limit the learner's capacity. The regularized objective function is
J_final(θ) = J_loss(θ) + αΩ(θ)    (6)
where θ is the set of all parameters and α is the regularization coefficient. Here the mean square error (MSE) is used as the loss and the squared 2-norm as the penalty. Thus, the final loss function of the network is
J_final(θ) = (a_output - a_actual)^2 + α Σ_i W_i^2    (7)
where a_output is the output of the network, whose elements are decimals, a_actual is the actual activation vector of the users, whose elements are 1 or 0, and W_i are the weight parameters of the network. Overfitting is greatly reduced after regularization.
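Formula (7) can be evaluated directly; a small worked sketch (the numbers and the value of α are illustrative choices of ours):

```python
import numpy as np

def regularized_loss(a_output, a_actual, weights, alpha):
    """J_final = MSE between predicted and true activation + alpha * sum(W^2)."""
    mse = np.mean((a_output - a_actual) ** 2)
    penalty = alpha * sum(np.sum(W ** 2) for W in weights)
    return mse + penalty

a_actual = np.array([1.0, 0.0, 1.0, 0.0])     # true activation, entries 1 or 0
a_output = np.array([0.9, 0.1, 0.8, 0.0])     # network outputs are decimals
weights = [np.array([[2.0, -1.0]]), np.array([0.5])]
loss = regularized_loss(a_output, a_actual, weights, alpha=0.01)
# MSE = mean(0.01, 0.01, 0.04, 0) = 0.015; penalty = 0.01 * 5.25 = 0.0525
print(round(loss, 4))   # → 0.0675
```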
S3, multiuser detection
S301, collecting or generating data; the data set contains the pilot sequence matrix S_all, the received signals y of a large number of samples and the active-user vectors a. The data set is divided into a training set and a test set at a ratio of 10:1; the training set is used to train the network and the test set to check whether the network has been trained successfully.
the number of activated users in the total number of users K is n, and the value of n and the distribution of the activated users are random. That is, the vector a ═ a describing the user activation casekIs a vector of dimension K, where random N values are 1 and the remaining values are 0, the signal transmitted by the active user is N modulated with Binary Phase Shift Keying (BPSK)sThe length pilot sequence. The pilot sequence length of each user is fixed to NsThe model of the channel is a gaussian channel with L independent co-distribution. The pilot sequence is randomly generated, K > (N)sThe multi-user detection problem at + L-1)/L is underdetermined. The number of users K in the scene is large, and the length N of the pilot sequencesBut is very limited and thus the problem is less qualitative than normal.
S302, in the network training process, the received signal y is input into the network and the loss between the output a_output and the activation vector a_actual of the data set sample, i.e. the loss function J(θ) = (a_output - a_actual)^2, is computed; optimization training is then performed with a stochastic gradient descent optimizer (SGD);
the training data uses data that activates an upper limit of the number n of users, for example, n ranges from 1 to 6, and then n-6 data is used, that is, random n-6 users among K users are transmitting signals. After the training process is completed, the data of the number n of the activated users, which is 1-6, can be calculated by the network. The accuracy of calculating n-6 by using a network with n-6 is slightly lower than the performance of calculating the n value by using a network with a specific n value. For example, training a network with n-4 data and then calculating n-4 data is slightly higher in performance, but it is necessary to train 6 networks with n-1-6, estimate n in calculation, and when n is in a larger range, complexity is many times higher. And thus is trained using the n upper bound data.
S303, in the testing process, inputting the test sample into a network, and calculating a loss function and a detection accuracy;
the loss function during training as a function of the number of trains is shown in fig. 4.
And S304, after the activation vector a has been estimated, the channel estimation of large-scale access is reduced to channel estimation for a small number of active users, and the channel is estimated with the minimum mean square error (MMSE) method.
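Once the active set is known, the MMSE estimate reduces to a small regularized least-squares solve. A hedged sketch (assuming unit-variance channel taps and noise variance sigma2; the function name and dimensions are ours): h_hat = (S_a^H S_a + sigma2 I)^{-1} S_a^H y, where S_a collects only the active users' pilot columns.

```python
import numpy as np

def mmse_channel_estimate(S_active, y, sigma2):
    """MMSE estimate of the active users' stacked channel vector:
    h_hat = (S^H S + sigma2 I)^-1 S^H y (unit-variance channel prior)."""
    G = S_active.conj().T @ S_active
    return np.linalg.solve(G + sigma2 * np.eye(G.shape[0]),
                           S_active.conj().T @ y)

rng = np.random.default_rng(3)
m, d = 12, 4                      # observations, active-channel dimension
S_a = rng.normal(size=(m, d))     # pilot columns of the detected users
h = rng.normal(size=d)            # true channel taps
y = S_a @ h + 0.01 * rng.normal(size=m)
h_hat = mmse_channel_estimate(S_a, y, sigma2=1e-4)
print(np.allclose(h_hat, h, atol=0.1))   # → True
```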
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Simulation experiment
In the simulation, the total number of users is K = 100, the number of active users is n ∈ {1, 2, ..., 6}, and the value of n and the positions of the active users are random.
The vector a = {a_k} describing user activation is a vector of dimension 100 with n random entries equal to 1 and the rest 0.
The signal transmitted by an active user is a pilot sequence of length N_s modulated with binary phase shift keying (BPSK).
The pilot sequence length of each user is fixed to N_s = 40.
The channel model is a Gaussian channel with L = 6 independent identically distributed taps.
The received noise n is an additive white gaussian noise vector with a signal-to-noise ratio of 10 dB.
For the transmission process y = Sx + n, the received signal is obtained from the pilot sequence matrix S ∈ C^((N_s+L-1)×KL) and the sparse channel x ∈ C^(KL) as

y = Sx + n
The sparse channel x is estimated using the received signal y and the known pilot sequence matrix S. The pilot sequences are randomly generated, and with K > (N_s + L - 1)/L the multi-user detection problem is underdetermined: the number of users K in the scenario is large while the pilot sequence length N_s is limited, so the problem is more underdetermined than usual. The proposed CLCNN network is compared with the BRNN network and with compressed sensing algorithms, including orthogonal matching pursuit (OMP), iterative hard thresholding (IHT), and block-structured orthogonal matching pursuit (BOMP).
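For context on these baselines, OMP-type methods recover a sparse x greedily. A generic textbook sketch of OMP (our own, not the exact implementations the patent compares against):

```python
import numpy as np

def omp(S, y, n_nonzero):
    """Greedy orthogonal matching pursuit: in each iteration pick the column
    most correlated with the residual, then re-fit the selected columns by
    least squares."""
    residual = y.copy()
    support = []
    x = np.zeros(S.shape[1])
    for _ in range(n_nonzero):
        support.append(int(np.argmax(np.abs(S.T @ residual))))
        cols = sorted(set(support))
        coef, *_ = np.linalg.lstsq(S[:, cols], y, rcond=None)
        x[:] = 0.0
        x[cols] = coef
        residual = y - S @ x
    return x

rng = np.random.default_rng(4)
S = rng.normal(size=(30, 60))
S /= np.linalg.norm(S, axis=0)          # unit-norm columns
x_true = np.zeros(60)
x_true[[5, 17, 42]] = [1.5, -2.0, 1.0]  # 3-sparse ground truth
y = S @ x_true
x_hat = omp(S, y, n_nonzero=3)
# Each greedy least-squares fit can only shrink the residual.
print(np.linalg.norm(y - S @ x_hat) < np.linalg.norm(y))   # → True
```

In easy regimes (few active users, sufficiently long pilots) OMP typically recovers the support exactly; BOMP extends the greedy selection to the L-sized blocks used in this model.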
In the simulation, 8 × 10^6 distinct training samples and 10^5 test samples were generated. In the training process, data with n = 6 active users are used, i.e. a random 6 of the 100 users transmit signals.
After the training process is completed, the network is evaluated on data with n = 1-6 active users. The accuracy of the n = 6 network on a given n is slightly lower than that of a network trained for that specific value of n; for example, a network trained on n = 4 data performs slightly better on n = 4 data, but 6 networks for n = 1-6 would then have to be trained and n estimated first at inference time, and when the range of n is large the complexity is many times higher. The optimizer used to train the neural network in the experiments is stochastic gradient descent (SGD) with a learning rate of 0.01. The batch size is fixed at 256.
FIG. 4 shows the loss function versus the number of training iterations: the loss decreases gradually during training and is close to convergence at 2 × 10^6 batches; the test error is slightly worse than the training error, the generalization performance is good, and no overfitting occurs.
Fig. 5 compares the user detection accuracy of the different schemes as the number of active users varies. The scheme of the invention performs best of all. In the simulation results, DNN performance is inferior to BRNN and the standard ResNet is difficult to converge, so neither is included in the figure. The deep learning approaches exceed the CS approaches in all scenarios, and among the deep learning schemes the invention exceeds the BRNN network.
Fig. 6 compares the channel estimation accuracy with the number of active users for the different schemes. The results after MMSE channel estimation mirror the user detection accuracy: the CLCNN network performs best. The advantages of the deep learning scheme and of the network structure designed in the invention yield the better performance.
In summary, compared with a standard residual network, the network of the invention trains faster: at the number of iterations where the proposed network has already converged, the standard residual network is still far from convergence, so its training speed is too low for it to be usable in the scenario considered here.
Compared with compressed sensing methods, the proposed method does not estimate n in advance, whereas the classical CS methods require such an estimate. In the experiments, n was supplied to the CS methods directly as a known value; if the step of estimating n were added, the already inferior CS methods would perform even worse. Moreover, once identity connections are used, networks of hundreds or even thousands of layers become feasible, and with enough data a deeper network delivers better performance. Improving the network is thus mainly a matter of more data and more layers, whereas improving a CS method usually requires a new algorithm with better performance, making CS performance harder to improve than that of the invention.
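The identity connection credited above for enabling very deep networks can be sketched as follows. This is a generic residual block, not the patent's exact layer: the weights are random stand-ins, and plain matrix products replace the one-dimensional convolutions for brevity. The solid-line case is direct addition; the dotted-line case (dimension change) uses a projection, which the patent realizes with a kernel-size-1 convolution.

```python
import numpy as np

# Generic residual block: output = relu(F(x) + shortcut(x)).
# Weights are random stand-ins; matrix products stand in for 1-D convolutions.
rng = np.random.default_rng(3)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2, W_proj=None):
    """F(x) is two weight layers with a ReLU between; the shortcut is the
    identity, or a projection W_proj when the channel count changes."""
    out = relu(x @ W1) @ W2
    shortcut = x if W_proj is None else x @ W_proj   # dotted-line (dim change) case
    return relu(out + shortcut)

x = rng.standard_normal((4, 64))                     # batch of 4, 64 channels
W1 = rng.standard_normal((64, 64)) * 0.01
W2 = rng.standard_normal((64, 128)) * 0.01
W_proj = rng.standard_normal((64, 128)) * 0.01       # channel change 64 -> 128
y = residual_block(x, W1, W2, W_proj)
```

Because the shortcut passes x through unchanged (or nearly so), gradients flow directly to earlier layers, which is why stacking hundreds of such blocks remains trainable.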
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above merely illustrates the technical idea of the present invention and does not thereby limit its protection scope; any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (6)

1. A multi-user detection method based on a convolutional residual network in a large-scale Internet of Things, characterized by comprising the following steps:

S1. Assuming that all frames are received synchronously, each user uses a different pilot sequence s_k for channel estimation, k = 1, ..., K. The pilot sequence matrix S_all of all users, the channel vectors h_k, and the user activities a_k are each placed in a corresponding matrix for joint optimization. The base station receives the signal y, from which x is reconstructed, simultaneously detecting the active users and estimating their channels. The optimization problem is:

[Equation: see Figure FDA0002982645770000011]

where ||·||_0 denotes the 0-norm, L is the dimension of the channel vector, and n is the maximum number of active users.

The signal y received by the base station is:

y = SAh + m = Sx + m

where S is the matrix of transmitted signals, x = Ah is the block-sparse vector with n non-zero blocks corresponding to the active users, and m is additive white Gaussian noise.

S2. Design a centralized-layer convolutional neural network (CLCNN) and detect the active users with a deep learning method: the received signal y is input to the deep learning network, which estimates the activation vector a.

S3. Collect or generate data, divided into a training set and a test set; input the received signal y to the network, perform optimization training with a stochastic gradient descent optimizer, and compute the loss function and detection accuracy. Channel estimation for large-scale access is thereby reduced to channel estimation for a small number of active users, and the channels are estimated with the minimum mean square error method, completing multi-user detection. Specifically:

S301. Collect or generate data; the data set contains the pilot sequence matrix S_all, the received signals y of a large number of samples, and the active-user vectors a. The data set is divided into a training set and a test set at a ratio of 10:1; the training set is used to train the network, and the test set is used to check whether training succeeded.

S302. During training, the received signal y is input to the network and the deviation between the output a_output and the activation vector a_actual of the data-set sample is computed, i.e., the loss function J(θ) = (a_output − a_actual)^2; training then proceeds with the stochastic gradient descent optimizer.

S303. During testing, the test samples are input to the network and the loss function and detection accuracy are computed.

S304. After the activation vector a is estimated, channel estimation for large-scale access is reduced to channel estimation for the few active users, and the channels are estimated with the minimum mean square error method.

The multi-user detection method involves one base station and multiple users; the access mode is grant-free, so users transmit directly without requesting permission from the base station, and in a given frame only a small number n of the K users in total transmit.

2. The method according to claim 1, characterized in that in step S2 the CLCNN structure uses convolutional layers as computing layers with one-dimensional convolution; the kernel size of the first layer is 7 and of the remaining layers 3, and the number of channels doubles stage by stage from 64 to 1028; the pooling stride is 2; the fully connected layer has 1000 neurons; for the identity connections, a solid line denotes direct addition and a dotted line denotes a dimension change, realized by a convolutional layer with kernel size 1; and a parameter norm penalty Ω(θ) is added to the objective function J_loss(θ) to limit the learner's capacity, completing the regularization of the CLCNN.

3. The method according to claim 2, characterized in that the regularized objective function of the CLCNN is:

J_final(θ) = J_loss(θ) + λΩ(θ)

where θ is the set of all parameters, λ is the regularization coefficient, Ω(θ) is the parameter norm penalty, and J_loss(θ) is the objective function.

4. The method according to claim 2, characterized in that the loss function of the CLCNN is:

J_final(θ) = (a_output − a_actual)^2 + a Σ_i W_i^2

where a_output is the output of the network, whose elements are fractional, a_actual is the actual activation vector of the users, and W_i are the weight parameters of the network.

5. The method according to claim 1, characterized in that in step S301 the number of active users among the K total users is n, with both the value of n and the identities of the active users random; the vector a = {a_k} describing user activity has dimension K, with n randomly chosen entries equal to 1 and the rest 0; the signal transmitted by an active user is a pilot sequence of length N_s modulated with binary phase shift keying; the pilot length of each user is fixed at N_s, and the channel model is L independent and identically distributed Gaussian taps; the pilot sequences are randomly generated, and the multi-user detection problem is underdetermined when K > (N_s + L − 1)/L, where L is the channel length.

6. The method according to claim 1, characterized in that in step S302 the training data uses data with the upper limit of the number n of active users.
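The regularized objective of claims 3 and 4 can be evaluated numerically as follows. This is a sketch under assumed values: the coefficient (written "a" in claim 4, here `lam`), the weight shapes, and the perturbed network output are all illustrative, not taken from the patent.

```python
import numpy as np

# Numerical sketch of the regularized objective of claims 3-4:
# J_final = (a_output - a_actual)^2 + lam * sum_i W_i^2 (all values illustrative).
rng = np.random.default_rng(4)
lam = 1e-4                                             # assumed regularization coefficient

a_actual = np.zeros(100)
a_actual[:6] = 1.0                                     # 6 active users
a_output = a_actual + 0.05 * rng.standard_normal(100)  # near-correct fractional output
weights = [rng.standard_normal((64, 64)),              # stand-in weight matrices W_i
           rng.standard_normal((64, 128))]

def j_final(a_output, a_actual, weights, lam):
    loss = float(np.mean((a_output - a_actual) ** 2))          # data term
    penalty = sum(float(np.sum(W ** 2)) for W in weights)      # parameter norm penalty
    return loss + lam * penalty

J = j_final(a_output, a_actual, weights, lam)
```

The penalty term grows with every weight, so minimizing J_final pulls the weights toward zero, which is the capacity limitation claim 2 attributes to the regularizer.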
CN201911358559.0A 2019-12-25 2019-12-25 A Multi-User Detection Method Based on Convolutional Residual Networks in Large-scale Internet of Things Expired - Fee Related CN111224905B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911358559.0A CN111224905B (en) 2019-12-25 2019-12-25 A Multi-User Detection Method Based on Convolutional Residual Networks in Large-scale Internet of Things

Publications (2)

Publication Number Publication Date
CN111224905A CN111224905A (en) 2020-06-02
CN111224905B true CN111224905B (en) 2021-07-13

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112261685B (en) * 2020-10-15 2023-02-10 东南大学 A Convolutional Neural Network Based MIMO Signal Detection Method
CN112637093B (en) * 2020-12-09 2022-05-17 齐鲁工业大学 Signal detection method based on model-driven deep learning
CN113766669B (en) * 2021-11-10 2021-12-31 香港中文大学(深圳) Large-scale random access method based on deep learning network
CN114337746A (en) * 2022-01-07 2022-04-12 重庆邮电大学 Large-scale MIMO signal detection method based on deep learning
CN115270851B (en) * 2022-06-23 2025-07-25 西安电子科技大学 Time-frequency spectrum reconstruction method based on unmanned aerial vehicle communication signal data enhancement
CN116505982B (en) * 2023-05-05 2025-09-12 北京邮电大学 Deep learning signal detection method for massive MIMO systems with high-order modulation

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109379120A (en) * 2018-12-11 2019-02-22 深圳大学 Link adaptation method, electronic device, and computer-readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108833313A (en) * 2018-07-12 2018-11-16 北京邮电大学 A method and device for wireless channel estimation based on convolutional neural network
CN109450830B (en) * 2018-12-26 2021-09-21 重庆大学 Channel estimation method based on deep learning in high-speed mobile environment
CN110445581B (en) * 2019-08-10 2022-11-01 南京理工大学 Method for reducing channel decoding error rate based on convolutional neural network
CN110351212A (en) * 2019-08-10 2019-10-18 南京理工大学 Based on the channel estimation methods of convolutional neural networks under fast fading channel

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210713