
CN109685204A - Model search method and device, and image processing method and device - Google Patents

Model search method and device, and image processing method and device

Info

Publication number
CN109685204A
CN109685204A (application number CN201811584647.8A; granted as CN109685204B)
Authority
CN
China
Prior art keywords
model
searched
trained
edge
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811584647.8A
Other languages
Chinese (zh)
Other versions
CN109685204B (en)
Inventor
郭梓超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd
Priority to CN201811584647.8A
Publication of CN109685204A
Application granted
Publication of CN109685204B
Current legal status: Expired - Fee Related


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the technical field of deep learning, and provides a model search method and device, and an image processing method and device. The model search method includes: constructing a structure to be searched, in which at least one edge connects any two connected nodes and each of those edges corresponds to a different candidate operation; training the structure to be searched, where in each iteration the model obtained by retaining one edge between every two connected nodes is taken as the model to be trained in that iteration, and if the model contains edges that have already been trained, the trained parameters of those edges are migrated directly; and, after the structure to be searched has been trained, selecting at least one usable model from the models contained in the structure to be searched according to test results on model performance. The method searches for models efficiently and can cover a large search range, avoiding the omission of valuable models.

Description

Model search method and device, and image processing method and device
Technical field
The present invention relates to the technical field of deep learning, and in particular to a model search method and device, and an image processing method and device.
Background art
Convolutional neural networks are among the most common models in the field of deep learning. In recent years, through hand design and experiment, researchers have continually proposed convolutional neural network models with better performance, such as AlexNet, VGG16, Inception, ResNet, Xception, and so on.
However, hand design places very high demands on the ability and skill of the researcher, who generally must spend a great deal of time and energy to design a model suited to a given task or data set. Moreover, since models can be designed in a great many ways, the number of model structures that hand design can consider is extremely limited, and models with good performance are likely to be missed.
Summary of the invention
In view of this, embodiments of the present invention provide a model search method and device, and an image processing method and device, which can bring a large number of models into the search range and efficiently find models that meet specific requirements.
To achieve the above object, the present invention provides the following technical solutions:
In a first aspect, an embodiment of the present invention provides a model search method for searching neural network models, comprising:
constructing a structure to be searched, the structure to be searched including multiple nodes and directed edges connecting the multiple nodes, where a node represents a unit that caches data in the neural network, the data cached by the start node of an edge is processed by a candidate operation and then input to the end node of that edge, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
training the structure to be searched using data in a training set, where at each iteration of training, the model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges already trained in previous iterations, the trained parameters corresponding to those trained edges are determined as the initial parameters of those edges in the current iteration;
after the structure to be searched has been trained, selecting at least one usable model from the models contained in the structure to be searched according to test results on model performance, where a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes.
In the above method, a neural network is represented as a directed graph composed of nodes and edges: a node represents a unit that caches data in the neural network, and the data cached by the start node of an edge is input to the end node of that edge after being processed by some operation.
Nodes can be added to the structure to be searched arbitrarily, and edges can be added between nodes arbitrarily; at least one edge is connected between any two connected nodes of the structure, and each such edge corresponds to one candidate operation. The structure to be searched can therefore accommodate a large number of neural networks, and these networks share common nodes and, in part, common edges. By constructing the structure to be searched, the model search can be carried out over a larger range, avoiding the omission of valuable model structures.
Meanwhile, during training with the above method, if the model to be trained in some iteration contains edges already trained in previous iterations, the trained parameters of those edges are used as their initial parameters in that iteration. This amounts to parameter sharing; with parameter sharing, model convergence is accelerated, which improves training efficiency and training quality, and in turn improves both the efficiency and the results of the model search.
In addition, the model search method has a high degree of automation: the user does not need to expend much energy designing a model structure, since the search method can automatically select model structures that meet the requirements.
In some embodiments, the model to be trained in the current iteration is the model obtained by randomly retaining one edge among the edges between every two connected nodes.
In these embodiments, the model to be trained is obtained by retaining edges at random. When the number of iterations is large enough, each edge in the structure to be searched is trained approximately the same number of times, which ensures that every edge is adequately trained and that the parameters of every edge are adequately shared.
Since the structure to be searched contains a large number of models, and every edge of the structure has been adequately trained, all of these models have in effect been adequately trained. Moreover, because parameters are shared between edges, only a small number of training iterations is needed to obtain a good training result.
In some embodiments, selecting at least one usable model from the models contained in the structure to be searched according to the test results on model performance comprises:
selecting, according to the test results on model performance, the top N models with the best performance from the models contained in the structure to be searched, where N is a positive integer greater than or equal to 1.
In some embodiments, selecting the top N models with the best performance from the models contained in the structure to be searched according to the test results on model performance comprises:
testing the performance of every model contained in the structure to be searched;
selecting, according to the test results, the top N models with the best performance from all the models contained in the structure to be searched.
These embodiments select the top N models by exhaustively testing every model, which guarantees in an absolute sense that the selected models are the best performing. Moreover, because testing is fast (usually far faster than training), exhaustive testing is feasible even when there are many models, so these embodiments have practical value.
In some embodiments, selecting the top N models with the best performance from the models contained in the structure to be searched according to the test results on model performance comprises:
selecting, according to the test results on model performance, the top N models with the best performance from the models contained in the structure to be searched using a heuristic search algorithm.
When a heuristic search algorithm searches a state space, it evaluates each searched position and continues the search from the better positions found so far, iterating until the target is reached. This skips a large number of meaningless search paths and markedly improves search efficiency. The N models found by a heuristic search algorithm may not be the best performing in an absolute sense, but their performance is good enough. Heuristic search algorithms include, but are not limited to, genetic algorithms, ant colony algorithms, simulated annealing, hill climbing, particle swarm optimization, and the like.
In some embodiments, after selecting at least one usable model from the models contained in the structure to be searched according to the test results on model performance, the method further comprises:
further training the at least one usable model using data of the goal task, and selecting the model with the best performance according to the results of the further training.
The data of the goal task can be used to further optimize the performance of the usable models, and the model finally chosen is the one best suited to performing that goal task.
In some embodiments, constructing the structure to be searched comprises:
constructing at least one unit to be searched, the unit to be searched including multiple nodes and directed edges connecting the multiple nodes;
constructing the structure to be searched from the at least one unit to be searched, where during construction each kind of unit to be searched may be replicated multiple times.
When there are many nodes and edges, it may be rather difficult for a user to design the entire structure to be searched directly, so it can be constructed in a modular way: units to be searched are constructed first and then assembled into the structure to be searched by replication and combination. In this way the user can concentrate on designing the units to be searched, which reduces the difficulty of model design.
In some embodiments, the candidate operations include a multiply-by-zero operation.
The absence of an edge between two nodes can be made equivalent to an edge corresponding to a multiply-by-zero operation, which makes uniform handling easier.
In some embodiments, the multiple nodes include a node with a summation function, which adds the input data coming from different nodes to obtain the data that the node needs to cache.
When a node has multiple input nodes, it must merge the input data; the merging operation can be summation, averaging, product, concatenation, and so on. In particular, if the input edges include an edge corresponding to the multiply-by-zero operation, a node with a summation function is the better fit: adding zero does not affect the sum, which is equivalent to that edge not actually existing, consistent with the meaning of the multiply-by-zero operation.
In a second aspect, an embodiment of the present invention provides an image processing method using a neural network model, the neural network model including an input layer, a middle layer and an output layer, the method comprising:
constructing a structure to be searched, the structure to be searched including multiple nodes and directed edges connecting the multiple nodes, where a node represents a unit that caches data in the neural network, the data cached by the start node of an edge is processed by a candidate operation and then input to the end node of that edge, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
training the structure to be searched using images in a training set, where at each iteration of training, the model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges already trained in previous iterations, the trained parameters corresponding to those trained edges are determined as the initial parameters of those edges in the current iteration;
after the structure to be searched has been trained, selecting at least one usable model from the models contained in the structure to be searched according to test results on model performance, where a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes;
determining a target model according to the at least one usable model;
receiving an input image with the input layer of the target model, extracting image features of the input image with the middle layer of the target model, and outputting a processing result for the input image with the output layer of the target model.
The target model used in the above image processing method is obtained by the model search method provided in the first aspect. That method searches models efficiently and covers a large search range, so a model suited to the image processing task can be found and good processing results obtained, while the efficiency of the whole image processing pipeline is also improved.
In a third aspect, an embodiment of the present invention provides a model search device for searching neural network models, comprising:
a construction module for constructing a structure to be searched, the structure to be searched including multiple nodes and directed edges connecting the multiple nodes, where a node represents a unit that caches data in the neural network, the data cached by the start node of an edge is processed by a candidate operation and then input to the end node of that edge, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
a training module for training the structure to be searched using data in a training set, where at each iteration of training, the model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges already trained in previous iterations, the trained parameters corresponding to those trained edges are determined as the initial parameters of those edges in the current iteration;
a selection module for selecting, after the structure to be searched has been trained, at least one usable model from the models contained in the structure to be searched according to test results on model performance, where a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes.
In some embodiments, the model to be trained in the current iteration is the model obtained by randomly retaining one edge among the edges between every two connected nodes.
In some embodiments, the selection module is specifically configured to:
select, according to the test results on model performance, the top N models with the best performance from the models contained in the structure to be searched, where N is a positive integer greater than or equal to 1.
In some embodiments, the selection module includes:
a test unit for testing the performance of every model contained in the structure to be searched;
a selection unit for selecting, according to the test results on model performance, the top N models with the best performance from all the models contained in the structure to be searched.
In some embodiments, the selection unit is specifically configured to:
select, according to the test results on model performance, the top N models with the best performance from the models contained in the structure to be searched using a heuristic search algorithm.
In some embodiments, the device further includes:
a retraining module for further training the at least one usable model using data of the goal task, and selecting the model with the best performance according to the results of the further training.
In some embodiments, the construction module is specifically configured to:
construct at least one unit to be searched, the unit to be searched including multiple nodes and directed edges connecting the multiple nodes;
construct the structure to be searched from the at least one unit to be searched, where during construction each kind of unit to be searched may be replicated multiple times.
In some embodiments, the candidate operations include a multiply-by-zero operation.
In some embodiments, the multiple nodes include a node with a summation function, which adds the input data coming from different nodes to obtain the data that the node needs to cache.
In a fourth aspect, an embodiment of the present invention provides an image processing device using a neural network model, the neural network model including an input layer, a middle layer and an output layer, the device comprising:
a construction module for constructing a structure to be searched, the structure to be searched including multiple nodes and directed edges connecting the multiple nodes, where a node represents a unit that caches data in the neural network, the data cached by the start node of an edge is processed by a candidate operation and then input to the end node of that edge, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
a training module for training the structure to be searched using images in a training set, where at each iteration of training, the model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges already trained in previous iterations, the trained parameters corresponding to those trained edges are determined as the initial parameters of those edges in the current iteration;
a selection module for selecting, after the structure to be searched has been trained, at least one usable model from the models contained in the structure to be searched according to test results on model performance, where a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes;
a model determination module for determining a target model according to the at least one usable model;
an execution module for receiving an input image with the input layer of the target model, extracting image features of the input image with the middle layer of the target model, and outputting a processing result for the input image with the output layer of the target model.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer program instructions that, when read and executed by a processor, perform the steps of the method provided by the first aspect or any possible implementation of the first aspect.
In a sixth aspect, an embodiment of the present invention provides an electronic device including a memory and a processor, the memory storing computer program instructions that, when read and executed by the processor, perform the steps of the method provided by the first aspect or any possible implementation of the first aspect.
To make the above objects, technical solutions and beneficial effects of the present invention clearer and more comprehensible, embodiments are provided below and described in detail with reference to the accompanying drawings.
Brief description of the drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only certain embodiments of the present invention and are therefore not to be construed as limiting its scope; those of ordinary skill in the art may derive other relevant drawings from these drawings without creative effort.
Fig. 1 shows a structural block diagram of an electronic device applicable to an embodiment of the present invention;
Fig. 2 shows a flow chart of a model search method provided by an embodiment of the present invention;
Fig. 3 shows a schematic diagram of a structure to be searched provided by an embodiment of the present invention;
Fig. 4 shows a schematic diagram of three of the models contained in the structure to be searched of Fig. 3;
Fig. 5 shows a schematic diagram of a way of constructing a structure to be searched provided by an embodiment of the present invention;
Fig. 6 shows a functional block diagram of a model search device provided by an embodiment of the present invention;
Fig. 7 shows a functional block diagram of another model search device provided by an embodiment of the present invention;
Fig. 8 shows a functional block diagram of an image processing device using a neural network model provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention but merely represents selected embodiments of it. Based on these embodiments, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present invention.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. Meanwhile, in the description of the present invention, the terms "first", "second", and so on are used only to distinguish one entity or operation from another and are not to be understood as indicating or implying relative importance, nor as requiring or implying any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. In the absence of further limitation, an element qualified by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes it.
Fig. 1 shows a structural block diagram of an electronic device applicable to an embodiment of the present invention. Referring to Fig. 1, the electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106 and an output device 108, which are interconnected by a bus system 112 and/or other forms of connection mechanism (not shown).
The processor 102 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and can control other components in the electronic device 100 to perform desired functions.
The storage device 104 may take various forms of computer-readable storage media, such as volatile memory and/or nonvolatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory. Nonvolatile memory may include, for example, read-only memory (ROM), hard disks, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the computer program instructions to implement the methods and/or other desired functions in the embodiments of the present invention. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (for example, images or sounds) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.
It is to be appreciated that the structure shown in Fig. 1 is merely illustrative; the electronic device 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 may be implemented in hardware, software, or a combination thereof. In embodiments of the present invention, the electronic device 100 may be, but is not limited to, a physical device such as a desktop computer, a laptop, a smartphone, an intelligent wearable device, or a vehicle-mounted device, and may also be a virtual device such as a virtual machine.
In the description of the model search method below, the steps of the method are illustrated as running on the processor 102 of the electronic device 100, but this should not be construed as limiting the scope of the invention.
Fig. 2 shows a flow chart of the model search method provided by an embodiment of the present invention. The method is used for searching neural network models; the concrete type of neural network is not limited and may be, for example, a convolutional neural network or a recurrent neural network. It should be pointed out that, below, a neural network model is sometimes also referred to simply as a neural network or a model. Referring to Fig. 2, the method comprises:
Step S20: construct the structure to be searched.
In an embodiment of the present invention, any neural network is represented as a directed graph. The directed graph includes multiple nodes and directed edges connecting these nodes, and exactly one edge is connected between any two connected nodes. Each node represents a unit that caches data in the neural network. Taking a convolutional neural network as an example, the input node of the network may represent the unit that caches the input image, the output node may represent the unit that caches the output result, and the intermediate nodes may represent units that cache feature maps. Each edge connects two nodes; since edges are directed, the two nodes may be called the start node and the end node. Each edge corresponds to an operation, and the data cached by the start node of an edge is input to the end node of that edge after being processed by the corresponding operation. Taking a convolutional neural network as an example, such an operation may be a 1x1 convolution, a 3x3 convolution, a depthwise separable convolution, max pooling, average pooling, and so on.
The structure to be searched includes multiple nodes and directed edges connecting the multiple nodes, with nodes and edges defined as above, except that at least one edge is connected between any two connected nodes of the structure to be searched, and each edge among those edges corresponds to a different operation, called a candidate operation, meaning an operation that might be included between those two connected nodes. The structure to be searched can therefore accommodate a large number of neural networks: each of those networks includes the common nodes (that is, all the nodes of the structure to be searched), and between any two connected nodes of each network exactly one edge is connected. It can be understood that an edge in the structure to be searched may be common to multiple neural networks.
Fig. 3 shows a schematic diagram of a structure to be searched provided by an embodiment of the present invention. Referring to Fig. 3, the structure to be searched includes 4 nodes, with 3 edges connected between any two nodes. In general, if the structure to be searched includes n nodes and m edges between any two nodes, then the total number of models contained in the structure to be searched is m x m x ... x m = m^(n(n-1)/2). For Fig. 3, the structure to be searched contains 3^6 = 729 models in all, among them the models of the 3 structures shown in Fig. 4. These models have the same nodes and may have common edges; for example, the first model (ordering from left to right) and the second model have the same edge between the second node (ordering from top to bottom) and the fourth node.
The structure to be searched determines the range of the search over the whole collection of neural networks it represents; the model finally found is produced from this collection. Taking the above formula as an example, the number of models contained in the structure to be searched grows exponentially with m and n: assuming m = 10 and n = 10, the number of models can reach as many as 10^45. It can be seen that the structure to be searched can cover a considerable search range, which is conducive to searching more model structures and avoids missing model structures with better performance.
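The count is easy to check numerically. The following few lines of Python are illustrative only and are not part of the patent:

```python
def model_count(n_nodes: int, m_edges: int) -> int:
    """Number of models in a structure to be searched in which every node
    pair is connected and carries m_edges candidate edges."""
    pairs = n_nodes * (n_nodes - 1) // 2
    return m_edges ** pairs

print(model_count(4, 3))    # Fig. 3: 3**6 = 729
print(model_count(10, 10))  # 10**45
```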
In some implementations, the structure to be searched may be designed by a user, and the electronic device 100 then constructs the data structure corresponding to the structure to be searched according to the user's design. For example, the user writes code describing the structure to be searched and the processor 102 constructs the structure by executing that code; or the user writes a configuration file describing the structure to be searched and the processor 102 constructs the structure by reading the configuration file; or nodes and edges are presented graphically on an interface, the user builds the directed graph corresponding to the structure to be searched by dragging, copying, pasting and similar operations, and the processor 102 constructs the actual structure to be searched from that directed graph; and so on.
In these implementations, the user does not need to spend much energy designing the structure to be searched: it suffices to determine the number of nodes and the connection relationships between them. In particular, when designing the edges between nodes, there is no need to analyze which operation would be better for an edge; it is enough to add the edges corresponding to all possible candidate operations between the nodes, and the result will be found automatically in the subsequent steps. The user's design burden is thereby greatly reduced; in other words, the technical level demanded of the user drops, and the user is not required to have much professional knowledge of model design.
In other implementations, the electronic device 100 may also design and construct the structure to be searched automatically according to certain preset rules.
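To make the construction concrete, here is a minimal sketch under the assumption that the structure is stored as a mapping from node pairs to candidate-operation names; the function and operation names are illustrative, not the patent's:

```python
import itertools

def build_search_structure(n_nodes, op_names):
    """Structure to be searched: every node pair (u, v) with u < v
    carries one candidate edge per operation name."""
    structure = {}
    for u, v in itertools.combinations(range(n_nodes), 2):
        structure[(u, v)] = list(op_names)  # one edge per candidate op
    return structure

ops = ["conv1x1", "conv3x3", "sep_conv", "maxpool", "avgpool", "zero"]
structure = build_search_structure(4, ops)
# A concrete model retains exactly one edge per connected node pair:
model = {pair: edges[0] for pair, edges in structure.items()}
```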
Step S21: train the structure to be searched using the data in the training set.
Training a neural network is a process requiring many iterations; in each iteration, one batch of data from the training set is used for training. In each iteration, a model is selected from the structure to be searched and trained; the model is determined by retaining one edge among the edges between every two connected nodes of the structure to be searched. How an edge is selected from the edges between two connected nodes is not limited: for example, one edge may be randomly selected in each iteration, or an edge different from the last iteration may be selected in each iteration, or an edge may be selected according to some preset rule. For the structure to be searched in Fig. 3, for instance, the three models shown in Fig. 4 may be selected during iterative training.
It has been noted above that there are common edges between some of the models in the structure to be searched, so some edges of the model to be trained in the current iteration may already have been trained before. If the model to be trained in the current iteration contains edges already trained in previous iterations, the trained parameters corresponding to those trained edges are determined as their initial parameters in the current iteration. Here, the parameters corresponding to an edge refer to the parameters of the operation corresponding to that edge; for example, if an edge represents a convolution operation, its corresponding parameters include the parameters in the convolution kernel. That is, the trained parameters of those edges are migrated into the model to be trained in the current iteration; in other words, these common edges realize parameter sharing between the models trained in previous iterations and the model to be trained in the current iteration. With parameter sharing, model convergence during training is accelerated, which improves training efficiency and training quality. Since the training process is the relatively time-consuming part of model search, improving training efficiency correspondingly improves the efficiency of the model search, while the improvement in training quality also helps improve the results of the search.
In some implementations, the model obtained by randomly retaining one edge among the edges between every two connected nodes of the structure to be searched is determined as the model to be trained in the current iteration. Random selection of edges is very simple to implement, and when the number of iterations is large enough, each edge in the structure to be searched is trained approximately the same number of times, which ensures that every edge is adequately trained and the parameters of every edge are adequately shared.
The significance of parameter sharing can be illustrated with Fig. 3. Suppose there are 300,000 iterations. Since each edge is selected with probability 1/3, each edge is trained 100,000 times (in expectation). Any model in the structure to be searched contains 3 edges; after its 3 edges have each been trained 100,000 times, it is as if the model itself had been individually trained 100,000 times.
By comparison, if the 3^6 = 729 models contained in Fig. 3 were trained individually, without parameter sharing, having every edge of every model trained 100,000 times would require 3^6 x 100,000 trainings in total, which is far less efficient.
When the structure to be searched contains more nodes, the effect of parameter sharing is even more prominent. Continuing the example in step S20: if m = 10 and n = 10, the structure to be searched contains 10^45 models. Assuming each model is trained independently for 100,000 iterations, 10^45 x 100,000 iterations would be needed in total, a number so large that ordinary present-day equipment cannot meet the computational demand; and even if it could, the training time would be unacceptable.
In the model search method provided by the embodiment of the present invention, thanks to parameter sharing, only 1,000,000 iterations (m times 100,000) are needed for every edge to be trained 100,000 times, many orders of magnitude faster than training models independently to reach the same training effect. The number of iterations is related only to the number of edges between every two connected nodes, not to the number of nodes.
In addition, obtaining the model to be trained by randomly selecting edges helps, in a probabilistic sense, to produce more distinct models for training, which improves the training effect.
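The training procedure of step S21 can be sketched as follows in PyTorch. This is a simplified illustration, not the patent's code: it assumes a chain of nodes (so every connected pair is consecutive), three shape-preserving candidate operations per pair, and dummy data in place of a real training set. Parameter sharing falls out of reusing the per-edge modules: an edge trained in an earlier iteration starts the next iteration from its trained parameters.

```python
import random
import torch
import torch.nn as nn

C = 16  # channel count (illustrative)

def make_ops():
    # Candidate operations on one edge; all preserve the tensor shape.
    return nn.ModuleDict({
        "conv3x3": nn.Conv2d(C, C, 3, padding=1),
        "conv1x1": nn.Conv2d(C, C, 1),
        "maxpool": nn.MaxPool2d(3, stride=1, padding=1),
    })

class SearchStructure(nn.Module):
    """Chain of nodes; every consecutive node pair carries 3 candidate
    edges. Each edge's parameters live here exactly once, so any sampled
    sub-model containing an already-trained edge reuses its parameters."""
    def __init__(self, n_pairs=4):
        super().__init__()
        self.pairs = nn.ModuleList(make_ops() for _ in range(n_pairs))
        self.head = nn.Linear(C, 10)

    def sample(self):
        # Randomly retain one edge per connected node pair.
        return [random.choice(list(p.keys())) for p in self.pairs]

    def forward(self, x, path):
        for ops, choice in zip(self.pairs, path):
            x = ops[choice](x)
        return self.head(x.mean(dim=(2, 3)))

supernet = SearchStructure()
opt = torch.optim.SGD(supernet.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
for step in range(100):                  # one batch per iteration
    x = torch.randn(8, C, 32, 32)        # dummy batch
    y = torch.randint(0, 10, (8,))
    path = supernet.sample()             # the model to be trained this time
    opt.zero_grad()
    loss_fn(supernet(x, path), y).backward()
    opt.step()                           # only sampled edges get gradients
```

Only the sampled edges receive gradients in a given iteration, so over many iterations every edge is trained roughly equally, as argued above.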
Step S22: after the structure to be searched has been trained, select at least one usable model from the models contained in the structure to be searched according to the test results on model performance.
A usable model is a model that meets the performance requirements and can be used for the goal task. Performance here can be understood broadly: it may refer to the precision of the model on the goal task, to the running speed of the model on the goal task, or to other measures. Although an actual task may need only one of the models, different model structures may still give researchers ideas for model design, so obtaining multiple usable models has practical significance.
In some implementations, if N (N >= 1) models whose performance meets certain requirements are to be found, the performance of the models in the structure to be searched can be tested one by one using data in a test set, and testing stops as soon as N models meeting the performance requirement have been found.
In other implementations, the top N models with the best performance need to be selected from the models contained in the structure to be searched, according to the test results on model performance, as the result of the search. At this point, at least the following two schemes can be adopted:
First, test the performance of every model contained in the structure to be searched, then select the top N models with the best performance from all those models according to the test results.
This scheme actually enumerates the models in the structure to be searched and tests them one by one, which guarantees in an absolute sense that the selected models are the best performing. Moreover, because testing is fast (usually far faster than training; testing one model may take only a few seconds), exhaustion is feasible even when there are many models and does not reduce the practicality of the scheme.
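A sketch of this exhaustive scheme for the chain-structured example above; `evaluate` is an assumed function returning a test-set performance score for a given path:

```python
import heapq
import itertools

def exhaustive_top_n(supernet, evaluate, n=10):
    """Enumerate every model (one edge per node pair), keep the best N."""
    choices_per_pair = [list(p.keys()) for p in supernet.pairs]
    scored = ((evaluate(supernet, path), path)
              for path in itertools.product(*choices_per_pair))
    return heapq.nlargest(n, scored, key=lambda s: s[0])
```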
Second, select the top N models with the best performance from the models contained in the structure to be searched using a heuristic search algorithm, according to the test results on model performance.
When a heuristic search algorithm searches a state space, it evaluates each searched position and continues the search from the better positions found so far, iterating until the target is reached. This skips a large number of meaningless search paths and markedly improves search efficiency. The N models found by a heuristic search algorithm may not be the best performing in an absolute sense (compared with exhaustion), but their performance is good enough. Relatively mature heuristic search algorithms currently include genetic algorithms, ant colony algorithms, simulated annealing, hill climbing, particle swarm optimization, and the like.
The following uses Fig. 3 to introduce how a genetic algorithm can be used for model search, assuming the top 10 models with the best performance are to be found. Denote the three edges between every two nodes by a, b and c respectively; each two nodes form a pair, giving 6 pairs in total, numbered 1-6. Any model in the structure to be searched can then be represented by an encoding such as (1a, 2b, 3a, 4b, 5b, 6c). For the state space formed by such encodings, one possible workflow of the genetic algorithm is as follows:
Step 1: randomly generate 20 encodings, i.e. 20 models; test these models, select the 10 best-performing encodings according to the test results, and store them.
Step 2: derive another 20 encodings by varying the 10 stored encodings. There are many kinds of variation, for example:
a. Crossover: randomly select 2 of the 10 stored encodings and cross them to generate one new encoding. For example, select (1a, 2a, 3a, 4c, 5c, 6c) and (1b, 2b, 3a, 4a, 5c, 6b); each position of the new encoding is taken at random from one of the two, e.g. (1a, 2b, 3a, 4c, 5c, 6b). Repeating this step 10 times yields 10 new encodings.
b. Mutation: randomly select 1 of the 10 stored encodings and mutate it to generate one new encoding. For example, select (1a, 2a, 3a, 4c, 5c, 6c) and mutate one randomly chosen position of the encoding, say changing 1a to 1c, giving the new encoding (1c, 2a, 3a, 4c, 5c, 6c). Repeating this step 10 times yields 10 new encodings.
Step 3: test the performance of the 20 encodings newly generated by crossover and mutation, combine them with the performance of the 10 previously stored encodings, reselect the 10 best-performing encodings from among them, and store these again.
Step 4: repeat steps 2 and 3 ten times in total, and keep the model structures corresponding to the 10 finally stored encodings as the top 10 best-performing models found by the algorithm.
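The workflow maps directly to code. A compact sketch under the same assumptions (6 node pairs, edges a, b and c, and an assumed `test_performance` scoring function):

```python
import random

PAIRS, EDGES = 6, ["a", "b", "c"]

def random_encoding():
    return tuple(random.choice(EDGES) for _ in range(PAIRS))

def crossover(p1, p2):
    # Each position taken at random from one of the two parents.
    return tuple(random.choice(pos) for pos in zip(p1, p2))

def mutate(enc):
    i = random.randrange(PAIRS)
    return enc[:i] + (random.choice(EDGES),) + enc[i + 1:]

def genetic_search(test_performance, rounds=10, keep=10):
    pop = [random_encoding() for _ in range(20)]        # step 1
    scores = {e: test_performance(e) for e in pop}
    best = sorted(scores, key=scores.get, reverse=True)[:keep]
    for _ in range(rounds):                             # step 4
        new = [crossover(*random.sample(best, 2)) for _ in range(10)]
        new += [mutate(random.choice(best)) for _ in range(10)]  # step 2
        scores.update({e: test_performance(e) for e in new})     # step 3
        best = sorted(set(best + new), key=scores.get, reverse=True)[:keep]
    return best  # top-10 encodings, i.e. model structures
```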
This algorithm tests 20 models each round and repeats 10 times, so in the end only 200 models are tested, far fewer than the 3^6 = 729 models contained in the structure to be searched. Yet the new structures the iterative algorithm generates each round are all derived from the currently better structures, so with each iteration the performance of the 10 stored models keeps shifting toward better performance, greatly reducing the testing of poorly performing models. For example, if among the three edges a, b and c the candidate operation corresponding to a performs worst, then during iteration the 20 newly generated encodings will contain edge a less and less often, and edges b and c more and more often, thereby saving the time that would have been spent testing the poorly performing structures containing edge a.
For other heuristic search algorithms, the principle of the search is similar to that of the genetic algorithm and is not elaborated here.
In addition, in the prior art, after a model is trained it usually still has to be fine-tuned to some extent on the training set before it can be used for the goal task, to avoid insufficient training. In an embodiment of the present invention, by contrast, if every edge has been fully trained while training the structure to be searched (for example, by means of the random edge selection mentioned above), the usable models selected in step S22 can be used directly for the goal task with no further fine-tuning of model parameters, which helps improve the efficiency of the model search.
In some embodiments, after at least one usable model has been obtained, the at least one usable model may be further trained using data of the goal task, and the model with the best performance selected according to the results of the further training. For example, if the at least one usable model consists of the top N best-performing models found in the structure to be searched, their performance ranking may change after further training, and finally only the single best-performing model is selected for the goal task, the model that, according to the training data, is best suited to performing that task. The data in the training set may be a subset of the data of the goal task.
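A minimal sketch of this further-training step, assuming each usable model is an ordinary PyTorch module, `goal_loader` yields (input, label) batches of goal-task data, and `val_metric` scores a model (all three are illustrative assumptions):

```python
import torch

def finetune_and_pick(candidates, goal_loader, val_metric, epochs=5):
    """Further train each usable model on goal-task data; return the best."""
    scores = []
    for model in candidates:
        opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
        loss_fn = torch.nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for x, y in goal_loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        scores.append(val_metric(model))  # rankings may change here
    return candidates[max(range(len(scores)), key=scores.__getitem__)]
```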
In some embodiments, the candidate operations corresponding to the edges of the structure to be searched include a multiply-by-zero operation. Such an edge transmits no data, which is equivalent to there actually being no connecting edge between the two nodes. Having no connecting edge between two nodes can also be regarded as an operation on a par with convolution, pooling and the like: before searching, it may well be hard to determine whether an edge should connect two nodes, so an edge corresponding to the multiply-by-zero operation can be added between the two nodes, side by side with the edges for the other operations, to express "not connected" as one selectable option. Introducing the edge corresponding to the multiply-by-zero operation helps enlarge the search range of the model and lets different model structures be handled uniformly, while also further reducing the difficulty for the user of designing the structure to be searched.
In some embodiments, the multiple nodes of the structure to be searched include a node with a summation function, which adds the input data coming from different nodes to obtain the data the node needs to cache.
When a node has multiple input nodes, it must merge the input data; the merging operation can be summation, averaging, product, concatenation, and so on. In particular, if the input edges include an edge corresponding to the multiply-by-zero operation, a node with a summation function is the better fit: adding zero does not affect the sum, which is equivalent to that edge not actually existing, consistent with the meaning of the multiply-by-zero operation. In other cases, nodes with different merging functions may be used according to actual needs. Referring to Fig. 3, suppose the edges from the first node to the third node and from the first node to the fourth node include edges corresponding to the multiply-by-zero operation; then the third and fourth nodes can use nodes with a summation function, shown in Fig. 3 with plus signs.
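A minimal sketch of the multiply-by-zero candidate operation and a summation node in PyTorch (illustrative code, not taken from the patent):

```python
import torch
import torch.nn as nn

class Zero(nn.Module):
    """Candidate operation 'multiply by 0': equivalent to no edge at all."""
    def forward(self, x):
        return torch.zeros_like(x)

def sum_node(inputs):
    """Node with a summation function: merge inputs by elementwise
    addition, so a Zero edge contributes nothing to the result."""
    return torch.stack(inputs, dim=0).sum(dim=0)
```

Because the Zero edge contributes an all-zero tensor, summing the inputs treats it exactly as if the edge were absent, which is the equivalence described above.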
When there are many nodes and edges, it may be rather difficult for a user to design the entire structure to be searched directly. In some embodiments, the structure to be searched can therefore be constructed in a modular way.
Specifically, the following approach can be used:
First construct at least one unit to be searched, where each kind of unit to be searched has a different structure. A unit to be searched is similar in composition to a structure to be searched, likewise including multiple nodes and directed edges connecting the multiple nodes; for example, to construct a larger structure to be searched, the structure to be searched in Fig. 3 could itself serve as a kind of unit to be searched.
Then construct the structure to be searched from the at least one unit to be searched. During construction, each kind of unit to be searched can be replicated multiple times, and different kinds of units to be searched can be combined, finally forming the structure to be searched. For example, in Fig. 5 three kinds of units to be searched are combined, namely unit A, unit B and unit C, where unit A has been replicated 3 times and unit B 2 times; the units are connected by single edges whose only function is to pass data, with the connections as shown in Fig. 5. It should be understood that, besides the units to be searched, the structure to be searched may also include nodes that do not belong to any unit, such as the input node and output node in Fig. 5.
In this way, when designing a larger structure to be searched, the user can concentrate on designing the smaller units to be searched, which helps reduce the difficulty of model design. Once the units are designed, a rather complex structure to be searched can be built in a relatively short time in this modular way.
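A sketch of the modular construction, with units replicated into a larger structure (illustrative Python; the unit layout echoing Fig. 5 is an assumption, and each unit is a stand-in that in practice would carry multiple candidate edges, like the SearchStructure sketched earlier):

```python
import torch.nn as nn

class UnitA(nn.Module):
    """Stand-in for a unit to be searched."""
    def __init__(self, c=16):
        super().__init__()
        self.body = nn.Conv2d(c, c, 3, padding=1)
    def forward(self, x):
        return self.body(x)

class UnitB(UnitA): pass
class UnitC(UnitA): pass

def build_from_units(layout):
    # Instantiate each entry fresh so every replica has its own edges.
    return nn.Sequential(*(cls() for cls in layout))

# Echoing Fig. 5: unit A replicated 3 times, unit B twice, then unit C.
structure = build_from_units([UnitA, UnitA, UnitB, UnitA, UnitB, UnitC])
```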
An embodiment of the present invention also provides an image processing method using a neural network model. The neural network model used in this method includes an input layer, a middle layer and an output layer, the common three-layer general structure of current neural network models, whose specific meaning is not explained in detail here. The image processing method specifically comprises the following steps:
Step a: construct a structure to be searched, the structure to be searched including multiple nodes and directed edges connecting the multiple nodes, where a node represents a unit that caches data in the neural network, the data cached by the start node of an edge is processed by a candidate operation and then input to the end node of that edge, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation.
Step b: train the structure to be searched using the images in a training set, where at each iteration of training, the model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges already trained in previous iterations, the trained parameters corresponding to those trained edges are determined as the initial parameters of those edges in the current iteration.
Step c: after the structure to be searched has been trained, select at least one usable model from the models contained in the structure to be searched according to the test results on model performance, where a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes.
Steps a to c above are similar to steps S20 to S22, except that the training data is limited to images, and are not described again in detail.
Step d: determine a target model according to the at least one usable model.
The target model is the neural network model to be used for the specific image processing task; such a task may be, but is not limited to, image classification, object detection, image segmentation, image recognition, and the like.
How the target model is determined from the at least one usable model is not limited: for example, the best-performing model may be chosen, or the one with the fastest running speed, or the one with the simplest model structure, and so on. In particular, since an image processing task uses only one model, a single usable model may be selected in step c and directly used as the target model in step d.
In some implementations, the target model need not be selected directly from the at least one usable model; the usable models may first be processed, and the target model then selected from the processed models. For example, the at least one usable model may be further trained using data of the image processing task, and the best-performing model according to the results of that further training used as the target model.
Step e: receive an input image with the input layer of the target model, extract image features of the input image with the middle layer of the target model, and output a processing result for the input image with the output layer of the target model.
The processing in step e is the method commonly used by current neural network models, and its detailed process is not explained here.
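As a concrete picture of step e, a minimal inference sketch for an image-classification target model (illustrative PyTorch; the three-way split into input, middle and output layers follows the description above, while the concrete layers are assumptions):

```python
import torch
import torch.nn as nn

target_model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),                  # input layer
    nn.Sequential(nn.ReLU(),
                  nn.Conv2d(16, 16, 3, padding=1),
                  nn.AdaptiveAvgPool2d(1)),          # middle layer: features
    nn.Sequential(nn.Flatten(), nn.Linear(16, 10)),  # output layer: result
)

image = torch.randn(1, 3, 32, 32)    # stand-in for a received input image
logits = target_model(image)
print(logits.argmax(dim=1))          # processing result, e.g. class index
```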
The target model used in the above image processing method is obtained with the model search method provided by the embodiments of the present invention. From the foregoing it can be seen that this method searches models efficiently and covers a large search range, so a model suited to the image processing task can be found and good processing results obtained, while the efficiency of the whole image processing pipeline is also improved.
An embodiment of the present invention also provides a model search device 300 for searching neural network models. Referring to Fig. 6, the device includes a construction module 310, a training module 320 and a selection module 330.
The construction module 310 is used to construct a structure to be searched, the structure to be searched including multiple nodes and directed edges connecting the multiple nodes, where a node represents a unit that caches data in the neural network, the data cached by the start node of an edge is processed by a candidate operation and then input to the end node of that edge, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation;
the training module 320 is used to train the structure to be searched using data in a training set, where at each iteration of training, the model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges already trained in previous iterations, the trained parameters corresponding to those trained edges are determined as the initial parameters of those edges in the current iteration;
the selection module 330 is used to select, after the structure to be searched has been trained, at least one usable model from the models contained in the structure to be searched according to test results on model performance, where a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes.
In some implementations of the apparatus, the model to be trained in the current iteration is a model obtained by randomly retaining one edge among the edges between every two connected nodes.
In some implementations of the apparatus, the selection module 330 is specifically configured to select, according to the test results of model performance, the top N best-performing models from the models contained in the structure to be searched, where N is a positive integer greater than or equal to 1.
Referring to Fig. 7, in some implementations of the apparatus, the selection module 330 includes a test unit 331 and a selection unit 332. The test unit 331 is configured to test the performance of each model contained in the structure to be searched; the selection unit 332 is configured to select the top N best-performing models from all the models contained in the structure to be searched according to the test results of model performance.
In some implementations of the apparatus, the selection unit 332 is specifically configured to select, according to the test results of model performance, the top N best-performing models from the models contained in the structure to be searched using a heuristic search algorithm.
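The disclosure does not fix a particular heuristic search algorithm. Purely as an illustration, the sketch below uses a simple evolutionary strategy, a common heuristic in architecture search; `evaluate` is a hypothetical callable that tests the performance of one model, encoded as one retained edge per pair of connected nodes:

```python
import random
from typing import Callable, List, Tuple

def evolutionary_top_n(num_edges: int, num_ops: int,
                       evaluate: Callable[[List[int]], float],
                       n: int, population: int = 20,
                       generations: int = 10) -> List[Tuple[float, List[int]]]:
    """Heuristically look for the top-N models instead of enumerating all
    num_ops ** num_edges models contained in the structure to be searched."""
    pool = [[random.randrange(num_ops) for _ in range(num_edges)]
            for _ in range(population)]
    scored = [(evaluate(c), c) for c in pool]
    for _ in range(generations):
        scored.sort(key=lambda s: s[0], reverse=True)
        survivors = scored[:population // 2]       # keep the best half
        children = []
        for _, parent in survivors:
            child = list(parent)
            child[random.randrange(num_edges)] = random.randrange(num_ops)  # mutate one edge
            children.append(child)
        scored = survivors + [(evaluate(c), c) for c in children]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:n]
```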
With continued reference to Fig. 7, in some implementations the apparatus further includes a retraining module 340, which is configured to further train the at least one available model using the data of the target task and to select the best-performing model according to the results of the further training.
In some implementations of the apparatus, the construction module 310 is specifically configured to: construct at least one kind of unit to be searched, the unit to be searched including a plurality of nodes and directed edges connecting the plurality of nodes; and construct the structure to be searched according to the at least one kind of unit to be searched, where each kind of unit to be searched may be replicated multiple times during construction.
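A minimal sketch of this cell-based construction is given below; `make_cell` is a hypothetical stand-in whose body would, in the real structure, itself contain nodes and candidate edges (such as the `Edge` sketch shown earlier):

```python
import torch.nn as nn

def make_cell(channels):
    """Stand-in for one kind of unit to be searched (nodes plus directed
    candidate edges); reduced here to a fixed block for brevity."""
    return nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())

def build_search_structure(channels, copies=4):
    # each kind of unit to be searched may be replicated several times
    return nn.Sequential(*(make_cell(channels) for _ in range(copies)))
```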
In some implementations of the apparatus, the candidate operations include a multiply-by-zero operation.
In some implementations of the apparatus, the plurality of nodes include a node with a summing function; such a node can add the input data coming from different nodes to obtain the data that the node needs to cache.
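A node with a summing function can be sketched as follows. Note how it combines with the multiply-by-zero candidate operation mentioned above: an edge carrying that operation simply contributes zeros to the sum, so the edge is effectively disconnected:

```python
import torch
import torch.nn as nn

class SumNode(nn.Module):
    """Adds the input data arriving from different nodes to obtain the data
    this node needs to cache."""
    def forward(self, *inputs):
        return torch.stack(inputs).sum(dim=0)  # elementwise sum of all inputs

node = SumNode()
cached = node(torch.ones(2, 3), torch.full((2, 3), 2.0))  # every entry is 3.0
```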
The implementation principles and resulting technical effects of the model search apparatus 300 provided by this embodiment of the present invention have already been introduced in the foregoing method embodiments. For brevity, for matters not mentioned in the apparatus embodiment, reference may be made to the corresponding content in the method embodiments.
An embodiment of the present invention further provides an image processing apparatus 400 that uses a neural network model, where the neural network model includes an input layer, an intermediate layer, and an output layer. Referring to Fig. 8, the apparatus includes a construction module 410, a training module 420, a selection module 430, a model determination module 440, and an execution module 450.
The construction module 410 is configured to construct a structure to be searched. The structure to be searched includes a plurality of nodes and directed edges connecting the plurality of nodes; a node represents a unit that caches data in the neural network, and an edge indicates that the data cached by the start node of the edge is input to the terminal node of the edge after being processed by a candidate operation. At least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation.
The training module 420 is configured to train the structure to be searched using the images in the training set. At each iteration of the training process, a model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration; if the model to be trained in the current iteration contains edges that have already been trained in previous iterations, the trained parameters corresponding to those edges are determined as their initial parameters in the current iteration.
The selection module 430 is configured to, after the structure to be searched is trained, select at least one available model from the models contained in the structure to be searched according to the test results of model performance, where a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes.
The model determination module 440 is configured to determine a target model according to the at least one available model.
The execution module 450 is configured to receive an input image using the input layer of the target model, extract image features of the input image using the intermediate layer of the target model, and output a processing result for the input image using the output layer of the target model.
The implementation principles and resulting technical effects of the image processing apparatus 400 that uses a neural network model, provided by this embodiment of the present invention, have already been introduced in the foregoing method embodiments. For brevity, for matters not mentioned in the apparatus embodiment, reference may be made to the corresponding content in the method embodiments.
An embodiment of the present invention further provides a computer-readable storage medium on which computer program instructions are stored; when the computer program instructions are read and executed by a processor, the steps of the methods provided by the embodiments of the present invention are performed. The computer-readable storage medium may be, but is not limited to, the storage device 104 shown in Fig. 1.
An embodiment of the present invention further provides an electronic device including a memory and a processor, where computer program instructions are stored in the memory; when the computer program instructions are read and executed by the processor, the steps of the methods provided by the embodiments of the present invention are performed. The electronic device may be, but is not limited to, the electronic device 100 shown in Fig. 1.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts between the embodiments may be referred to each other. Since the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the descriptions in the method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the architectures, functions, and operations that may be implemented by the apparatuses, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, each module may exist alone, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present invention, in essence, or the part contributing to the prior art, or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned computer device includes various devices capable of executing program code, such as a personal computer, a server, a mobile device, a smart wearable device, a network device, and a virtual device; the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, a magnetic tape, or an optical disc.
The above descriptions are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that can be readily conceived by those skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (14)

1. A model search method for searching for a neural network model, comprising: constructing a structure to be searched, the structure to be searched comprising a plurality of nodes and directed edges connecting the plurality of nodes, wherein a node represents a unit that caches data in the neural network, an edge represents that the data cached by the start node of the edge is input to the terminal node of the edge after being processed by a candidate operation, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation; training the structure to be searched using data in a training set, wherein at each iteration of the training process a model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges that have been trained in previous iterations, the trained parameters corresponding to the trained edges are determined as the initial parameters of the trained edges in the current iteration; and after the structure to be searched is trained, selecting at least one available model from the models contained in the structure to be searched according to test results of model performance, wherein a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes.
2. The model search method according to claim 1, wherein the model to be trained in the current iteration is a model obtained by randomly retaining one edge among the edges between every two connected nodes.
3. The model search method according to claim 1, wherein selecting at least one available model from the models contained in the structure to be searched according to test results of model performance comprises: selecting, according to the test results of model performance, the top N best-performing models from the models contained in the structure to be searched, where N is a positive integer greater than or equal to 1.
4. The model search method according to claim 3, wherein selecting the top N best-performing models from the models contained in the structure to be searched according to the test results of model performance comprises: testing the performance of each model contained in the structure to be searched; and selecting the top N best-performing models from all the models contained in the structure to be searched according to the test results of model performance.
5. The model search method according to claim 3, wherein selecting the top N best-performing models from the models contained in the structure to be searched according to the test results of model performance comprises: selecting, according to the test results of model performance, the top N best-performing models from the models contained in the structure to be searched using a heuristic search algorithm.
6. The model search method according to any one of claims 1-5, wherein after selecting at least one available model from the models contained in the structure to be searched according to the test results of model performance, the method further comprises: further training the at least one available model using data of a target task, and selecting the best-performing model according to results of the further training.
7. The model search method according to claim 1, wherein constructing the structure to be searched comprises: constructing at least one kind of unit to be searched, the unit to be searched comprising a plurality of nodes and directed edges connecting the plurality of nodes; and constructing the structure to be searched according to the at least one kind of unit to be searched, wherein each kind of unit to be searched may be replicated multiple times during construction.
8. The model search method according to claim 1, wherein the candidate operations comprise a multiply-by-zero operation.
9. The model search method according to claim 1, wherein the plurality of nodes comprise a node with a summing function, the node with the summing function being capable of adding input data from different nodes to obtain the data that the node needs to cache.
10. An image processing method using a neural network model, wherein the neural network model comprises an input layer, an intermediate layer, and an output layer, the method comprising: constructing a structure to be searched, the structure to be searched comprising a plurality of nodes and directed edges connecting the plurality of nodes, wherein a node represents a unit that caches data in the neural network, an edge represents that the data cached by the start node of the edge is input to the terminal node of the edge after being processed by a candidate operation, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation; training the structure to be searched using images in a training set, wherein at each iteration of the training process a model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges that have been trained in previous iterations, the trained parameters corresponding to the trained edges are determined as the initial parameters of the trained edges in the current iteration; after the structure to be searched is trained, selecting at least one available model from the models contained in the structure to be searched according to test results of model performance, wherein a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes; determining a target model according to the at least one available model; and receiving an input image using the input layer of the target model, extracting image features of the input image using the intermediate layer of the target model, and outputting a processing result for the input image using the output layer of the target model.
11. A model search apparatus for searching for a neural network model, comprising: a construction module configured to construct a structure to be searched, the structure to be searched comprising a plurality of nodes and directed edges connecting the plurality of nodes, wherein a node represents a unit that caches data in the neural network, an edge represents that the data cached by the start node of the edge is input to the terminal node of the edge after being processed by a candidate operation, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation; a training module configured to train the structure to be searched using data in a training set, wherein at each iteration of the training process a model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges that have been trained in previous iterations, the trained parameters corresponding to the trained edges are determined as the initial parameters of the trained edges in the current iteration; and a selection module configured to, after the structure to be searched is trained, select at least one available model from the models contained in the structure to be searched according to test results of model performance, wherein a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes.
12. An image processing apparatus using a neural network model, wherein the neural network model comprises an input layer, an intermediate layer, and an output layer, the apparatus comprising: a construction module configured to construct a structure to be searched, the structure to be searched comprising a plurality of nodes and directed edges connecting the plurality of nodes, wherein a node represents a unit that caches data in the neural network, an edge represents that the data cached by the start node of the edge is input to the terminal node of the edge after being processed by a candidate operation, at least one edge is connected between any two connected nodes, and each edge among the at least one edge corresponds to a different candidate operation; a training module configured to train the structure to be searched using images in a training set, wherein at each iteration of the training process a model obtained by retaining one edge among the edges between every two connected nodes is determined as the model to be trained in the current iteration, and if the model to be trained in the current iteration contains edges that have been trained in previous iterations, the trained parameters corresponding to the trained edges are determined as the initial parameters of the trained edges in the current iteration; a selection module configured to, after the structure to be searched is trained, select at least one available model from the models contained in the structure to be searched according to test results of model performance, wherein a model contained in the structure to be searched refers to a model obtained by retaining one edge among the edges between every two connected nodes; a model determination module configured to determine a target model according to the at least one available model; and an execution module configured to receive an input image using the input layer of the target model, extract image features of the input image using the intermediate layer of the target model, and output a processing result for the input image using the output layer of the target model.
13. A computer-readable storage medium having computer program instructions stored thereon, wherein, when the computer program instructions are read and executed by a processor, the steps of the method according to any one of claims 1-10 are performed.
14. An electronic device, comprising a memory and a processor, wherein computer program instructions are stored in the memory, and when the computer program instructions are read and executed by the processor, the steps of the method according to any one of claims 1-10 are performed.
CN201811584647.8A 2018-12-24 2018-12-24 Image processing method and device, storage medium and electronic device Expired - Fee Related CN109685204B (en)

Priority Applications (1)

CN201811584647.8A (filed 2018-12-24), granted as CN109685204B: Image processing method and device, storage medium and electronic device

Publications (2)

CN109685204A (published 2019-04-26)
CN109685204B (published 2021-10-01)

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2021-10-01)