WO1992017849A1 - Automatic design of signal processors using neural networks - Google Patents
Automatic design of signal processors using neural networks
- Publication number
- WO1992017849A1 WO1992017849A1 PCT/US1992/002796 US9202796W WO9217849A1 WO 1992017849 A1 WO1992017849 A1 WO 1992017849A1 US 9202796 W US9202796 W US 9202796W WO 9217849 A1 WO9217849 A1 WO 9217849A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- training
- gain
- node
- neural network
- learning rate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0499—Feedforward networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Definitions
- The invention relates to neural networks and methods of training neural networks.
- An important problem in signal processing is the capability to discriminate between signals originating from many different measurements. The discrimination is arbitrary in that any difference in the target or environment can serve as the basis of separation.
- The signal processing system must evaluate an available set of measurements and determine whether the signals are separable at the operating signal-to-noise ratio (SNR).
- A filter or transform is applied to the raw signal to obtain a representation that contains an easily separable set of features.
- The difficulty lies in designing a transformation that maps the raw signal into a more easily separable representation. In some cases, it is clear that a certain filtering operation is appropriate, though in general, the selection of a filter that maps the raw signal to a salient data representation is a heuristic process.
- The signal processor would "learn" the mapping required to perform signal discrimination from a known set of separable measurements.
- The relevant information is known to be in the frequency domain, and the separability can be enhanced by applying an FFT to the signal.
- The appropriate mapping is unknown, and some heuristic application of transformations known in the literature must be attempted.
- Neural networks offer the promise of a completely data-driven processor that automatically learns the required mapping by example.
- The neural network based processor can simply be retrained to accommodate changes in the measurement and discrimination.
- The neural network approach has a loose correspondence to biological nervous systems, in which each neuron receives input from potentially thousands of other neurons to form a nonlinear interconnected network. It is believed that these biological networks are capable of learning highly complex mappings.
- A possible mechanism for learning in neural systems was proposed by Hebb (see D. O. Hebb, "The Organization of Behavior," New York, N.Y., John Wiley, 1949), and this led to an interest in computer modeling of networks of "neuron-like" elements.
- A training algorithm for multi-layer networks was then developed that allowed any desired mapping to be approximated given enough nodes in the network.
- This approach was later redeveloped as the BEP training algorithm and applied to many different problems in the framework of parallel distributed processing (see D.E. Rumelhart, G.E. Hinton and R.J. Williams, "Learning Internal Representations by Error Propagation," in D.E. Rumelhart and J.L. McClelland (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, Cambridge, MA, MIT Press (1986)).
- A drawback of the BEP approach is that convergence is slow due to local minima problems, and it requires thousands of presentations of the training set for large-dimensional problems.
- The invention relates to a neural network based signal recognition system that "learns" an appropriate transform for a given signal type and application.
- The recognition system is based on a multi-layer perceptron neural network trained with a highly efficient deterministic annealing algorithm that can be two to three orders of magnitude faster than the commonly used Backward Error Propagation (BEP) technique.
- The training algorithm is less susceptible to local minima problems.
- The system is data driven in the sense that nodes are added until a specified level of performance is achieved, thereby making the most efficient use of the available processing resources.
- The invention features a method of training a neural network having an output layer and at least one middle layer including one or more internal nodes, each of which is characterized by a node activation function having a gain.
- The method includes the steps of setting the gain on at least some of the internal nodes equal to an initial gain value; training the multi-layer perceptron starting with the initial gain value; and changing the gain on at least some of the internal nodes during training, the gain change being in a direction which increases the sensitivity of the multi-layer perceptron.
- Preferred embodiments include the following features.
- The neural network is a fully connected, multi-layer perceptron neural network which includes no more than one middle layer that has but a single node.
- The training employs a gradient descent training procedure, in particular, a back error propagation training procedure.
- The training is characterized by a learning rate, and the method also includes the step of decreasing that learning rate during training while also changing the gain.
- The method further includes the step of setting the gain of each output node to a fixed value before beginning any training.
- The internal nodes are each characterized by a sigmoid-like activation function which has the following form:
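The specific expression (referred to later as Eq. 1) is not reproduced in this extract. A common gain-parameterized sigmoid, consistent with the later statement that a high or low magnitude of the gain term in Eq. 1 yields a sharp or fuzzy boundary, is sketched below; this is an assumption, not the patent's verbatim equation.

```latex
% Assumed gain-parameterized sigmoid (the patent's exact Eq. 1 is not shown in this extract);
% beta denotes the gain of the node.
f(x) = \frac{1}{1 + e^{-\beta x}}
```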
- The method further includes the step of computing an error for the neural network after the gain has reached a final gain, the error indicating how well the neural network has been trained. Also, the method includes the further steps of adding an additional node to one of the internal layers if the error exceeds a predetermined threshold; and, after adding the additional node, retraining the neural network.
- The training is supervised training using a training set made up of members for which corresponding desired outputs are known, and said error is a measure of how far the desired outputs for the members of the training set are from actual outputs generated by applying the members of the training set to the neural network.
- The error is computed in accordance with the following equation:
- E = Σ_p Σ_j | o_j^p - d_j^p |, where p is an index identifying a member of the training set; j is an index identifying an output node; o_j^p is an actual output of output node j for the pth member of the training set; and d_j^p is a desired output of output node j for the pth member of the training set.
- The method further includes the steps of determining whether the training is converging; and, if it is determined that the training is not converging, modifying the training by increasing the learning rate so as to cause an instability in training to occur.
- The method also includes the step of resuming training at a reduced learning rate after training for a preselected period of time with the increased learning rate.
- The invention also features an apparatus for training a neural network having an output layer and at least one middle layer which includes one or more internal nodes, each of which is characterized by a node activation function having a gain.
- The apparatus includes means for setting the gain on at least some of the internal nodes equal to an initial gain value; means for training the multi-layer perceptron starting with the initial gain value; and means for changing the gain on at least some of the internal nodes during training, the gain change being in a direction which increases the sensitivity of the multi-layer perceptron.
- The internal nodes are each characterized by a sigmoid-like activation function of the following form:
- The apparatus also includes means for computing an error for the neural network after the gain has reached a final gain, the error indicating how well the neural network has been trained. It further includes means for adding an additional node to one of the internal layers if the error exceeds a predetermined threshold; and means for causing the training means to retrain the neural network after the additional node has been added.
- One advantage of the invention is that it can find a solution for architectures which appear to be insufficient based upon previous training techniques. For many problems involving real sensor signals, the invention arrives at architectures requiring fewer than 10 to 15 internal ("hidden") nodes to achieve the desired signal discrimination. In addition, the invention enables one to train a neural network on a very limited part of a data set and still achieve good generalization to the remainder of that data set. Moreover, the performance of the training algorithm is not particularly dependent on the order in which the neural network is trained. Other advantages and features will become apparent from the following description of the preferred embodiment and from the claims.
- Figs. 2a and 2b present a flow chart of the gradient descent gain annealing (GDGA) algorithm for training a multi-layer perceptron;
- Fig. 3 shows the testing performance of an MLP network as a function of the percent of the data set used for training;
- Fig. 4 shows the training performance of a single-hidden-node MLP network as a function of the percent of the data set used for training; and Fig. 5 is a comparison of the average testing performance of an MLP network trained on 1% of the data set.
- A multi-layer perceptron (MLP) neural network 10 is made up of an input layer 12 of input nodes 14 followed by an internal "hidden" layer 16 of internal nodes 18 that are connected to an output layer 20 of output nodes 22.
- MLP network 10 is a fully interconnected MLP network operating in a feed-forward mode.
- Each node in a given layer is connected to every node in the next higher layer, and conversely, every node at any level above input layer 12 receives input from every node on the next lower level.
- The node labelled "A", i.e., a representative node 14 of input layer 12, is connected to every node 18 in the next higher, internal layer 16, and the node labelled "B", i.e., a representative node 18 of internal layer 16, is connected to every node 14 in the input layer 12.
- Although the depicted MLP network has only a single hidden layer 16, it could have more than one hidden layer depending upon the complexity and type of problem being modeled.
- The number of input nodes 14 which are actually used depends on the dimensionality of the signal which will be fed into MLP network 10. For example, if the input signal is an M-point FFT, it may be necessary to use M input nodes.
- Each node in MLP network 10 is characterized by a particular node activation function f(x) and an offset θ, and each connection between node j in one layer and node k in the next lower level is characterized by a weight w_kj.
- The activation functions for internal nodes 18 and output nodes 22 are sigmoid functions having the following form:
- The output for node j on level l is as follows:
- O_j(l) = f[ (Σ_k O_k(l-1) · w_kj) + θ_j(l) ]  (Eq. 5)
- In general, a modified BEP training procedure is used to train MLP network 10. It is modified by starting the system at a small magnitude for the gain and annealing the system to a large gain. At each gain value the BEP algorithm is run to convergence. The gain is a variable which has the characteristic that changing it deforms the energy surface with respect to the other free parameters (i.e., the weights and offsets). At low gain values, the MLP network has a nearly flat energy landscape, and the search covers a large portion of the parameter space.
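As an illustration of the feed-forward computation of Eq. 5, the sketch below propagates a signal through fully connected layers using a gain-parameterized sigmoid. The sigmoid form, the array shapes, and the example gain values are illustrative assumptions; only the weighted-sum-plus-offset structure and the per-layer gain come from the text.

```python
import numpy as np

def sigmoid(x, beta):
    """Assumed gain-parameterized sigmoid; the patent's exact Eq. 1 is not shown here."""
    return 1.0 / (1.0 + np.exp(-beta * x))

def layer_output(o_prev, w, theta, beta):
    """Eq. 5: O_j(l) = f[ sum_k O_k(l-1) * w_kj + theta_j(l) ].

    o_prev : outputs of level l-1, shape (n_prev,)
    w      : weights w_kj, shape (n_prev, n_curr)
    theta  : offsets theta_j(l), shape (n_curr,)
    beta   : gain applied to the nodes of level l
    """
    return sigmoid(o_prev @ w + theta, beta)

# Example: 8 inputs -> 3 hidden nodes -> 1 output node (shapes and gains are assumptions)
rng = np.random.default_rng(0)
x = rng.normal(size=8)
w1, th1 = rng.normal(size=(8, 3)), rng.normal(size=3)
w2, th2 = rng.normal(size=(3, 1)), rng.normal(size=1)
hidden = layer_output(x, w1, th1, beta=0.5)       # low hidden-node gain early in annealing
output = layer_output(hidden, w2, th2, beta=2.0)  # output-node gain held fixed
```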
- The dynamic-architecture MLP network has the ability to grow to accommodate the complexity of the problem and make efficient use of the available resources. The GDGA training procedure will now be described in greater detail.
- In the described embodiment, the MLP network includes only a single hidden layer. However, it should be understood that the procedure applies to other MLP networks and other architectures, including those with multiple hidden layers.
- The steps of a GDGA training algorithm 100 are presented in Figs. 2a-b.
- Training algorithm 100, which implements a supervised training schedule, begins with the selection of a set of input signals for which the desired outputs are known, i.e., a training set (step 102).
- MLP network 10 is initialized, which involves setting the gain of all of the output nodes to a low fixed quantity, e.g. -2 (step 104).
- This approach, as compared to other prior art approaches to this problem, has the advantage of proceeding from simpler architectures to more complex ones based upon the demands of the problem, rather than starting with an architecture which is unnecessarily complex for the problem at hand and then trying to pare away unneeded nodes.
- The weights and offsets of all of the internal and output nodes 18 and 22 and connections are set to some small random values (step 108).
- The values for the weights and offsets are selected by using the following algorithm: 2.0 * RANDOM - 1.0, where RANDOM is a random-number-generating function which produces a number between 0 and 1.
- The output of RANDOM is scaled and shifted so as to yield a distribution of randomly generated numbers centered on zero, which thus introduces no bias into the initialization of MLP network 10.
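A minimal sketch of the initialization of step 108, using the 2.0 * RANDOM - 1.0 rule described above; the parameter names and array shapes are assumptions for illustration.

```python
import numpy as np

def init_params(n_in, n_hidden, n_out, rng=np.random.default_rng()):
    """Set all weights and offsets to random values in (-1, 1), centered on zero."""
    def rand(*shape):
        return 2.0 * rng.random(shape) - 1.0   # 2.0 * RANDOM - 1.0
    return {
        "w_hidden": rand(n_in, n_hidden), "theta_hidden": rand(n_hidden),
        "w_out": rand(n_hidden, n_out),   "theta_out": rand(n_out),
    }
```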
- Algorithm 100 defines the ranges over which the gain β of internal nodes 18 and the learning rate for the subsequent training will be permitted to vary during the gain annealing process (step 110). It then initializes the gain and learning rate to the initial values (step 112).
- β_init, the initial gain value
- β_final, the final gain value, is set to -10.0
- η_init, the initial learning rate, is set to 0.03
- η_final, the final learning rate
- Algorithm 100 also initializes an energy variable E_k equal to some large number.
- E_k serves to keep track of the minimum energy which is achieved during the training procedure. Setting E_k to a large number assures that the first computed energy for MLP network 10 will be smaller than the initial value of E_k.
- E_old is a measure of the total error between the actual outputs for all of the training set signals and the desired outputs for those signals (step 114).
- The expression for computing E_old is as follows: E_old = Σ_p Σ_j | o_j^p - d_j^p |, where o_j^p is the actual output of output node j for the pth signal of the training set, and d_j^p is the desired output of output node j for the pth signal of the training set.
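A sketch of the energy computation of step 114, using the absolute-error sum defined above; the array layout (patterns by output nodes) is an assumption.

```python
import numpy as np

def energy(actual, desired):
    """E = sum over patterns p and output nodes j of |o_j^p - d_j^p|."""
    actual = np.asarray(actual)    # shape (n_patterns, n_output_nodes)
    desired = np.asarray(desired)  # same shape
    return np.sum(np.abs(actual - desired))

# e.g., one output node with desired outputs 0.95 (type A) and 0.05 (type B):
E_old = energy([[0.90], [0.10]], [[0.95], [0.05]])  # -> 0.1
```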
- After computing E_old, algorithm 100 begins training MLP network 10 to adjust the weights and offsets using a back error propagation (BEP) procedure such as is well known to those skilled in the art (step 118).
- The BEP training procedure is run in the mode in which the weights and offsets are adjusted for each signal pattern of the training set rather than for the entire training set at once.
- One iteration of the BEP training procedure consists of a separate training for each of the members of the training set.
- The BEP training continues through multiple iterations until either the desired convergence is achieved or the number of iterations exceeds some threshold amount, indicating that the procedure is not converging.
- Algorithm 100 keeps track of the number of iterations which are performed for a given gain and learning rate to determine whether the training procedure becomes stuck and fails to converge.
- Algorithm 100 computes E_new, the energy for MLP network 10 resulting from that iteration of training (step 120).
- E_new is then compared to E_k (step 122). If it is smaller than E_k, the value of E_k is set equal to E_new and the weights, offsets and gain for that new minimum are saved (step 124).
- Algorithm 100 then determines whether the number of iterations which have been performed during this loop of the BEP training procedure has exceeded 50 (step 126). During the initial iterations of the training, algorithm 100 will of course detect that the number of iterations does not exceed 50, and it will then determine whether the desired convergence toward a global solution is occurring (step 134). Algorithm 100 performs the convergence test by comparing the relative difference between E_new and E_old to some threshold level. In particular, algorithm 100 computes the absolute value of (E_new - E_old)/E_new and checks whether it is greater than 0.001.
- If it is, algorithm 100 sets the value of E_old to E_new (step 135) and moves on to the next iteration of the BEP training procedure (i.e., algorithm 100 branches back to step 118).
- If the training procedure gets trapped in a local minimum, which causes the value of E to oscillate from one iteration to the next, it may be necessary to force the system out of that local minimum.
- The iteration count indicates when such a problem occurs by rising above 50 (see step 126).
- When algorithm 100 detects that the iteration count has exceeded 50, it "kicks" the system by boosting the learning rate to a very high number, e.g. 0.75 (step 128). After the learning rate has been increased to 0.75, algorithm 100 performs ten iterations of the BEP training procedure (step 130). Forcing a high learning rate during BEP training causes the system to become unstable and thus dislodges it from the local minimum.
- After the tenth iteration, algorithm 100 jumps to the next higher gain and the next lower learning rate, and branches back to step 118 to proceed with the BEP training with the new set of initial values for the state variables. It should be noted that in the described embodiment, algorithm 100 moves through the range of permissible gains and the range of permissible learning rates in a linear fashion, one jump at a time. Each step in gain is equal to (β_init - β_final)/5 and each step in learning rate is equal to (η_init - η_final)/5. In addition, when algorithm 100 increases the gain by one step, at the same time it also decreases the learning rate by one step.
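The control flow of steps 118 through 140 can be summarized by the following sketch. It is a paraphrase for illustration only: bep_iteration and energy are placeholder callables (one BEP pass over the training set, and the energy computation shown earlier), while the five-step linear schedule, the 0.001 convergence threshold, the 50-iteration limit, the 0.75 kick rate, and the ten kick iterations are taken from the text.

```python
import numpy as np

def gdga_outer_loop(params, train_set, bep_iteration, energy,
                    beta_init, beta_final, eta_init, eta_final, n_steps=5):
    """Anneal the hidden-node gain from beta_init to beta_final while lowering the
    learning rate from eta_init to eta_final, running BEP to convergence at each
    (gain, learning-rate) pair and 'kicking' the system out of local minima."""
    gains = np.linspace(beta_init, beta_final, n_steps + 1)
    rates = np.linspace(eta_init, eta_final, n_steps + 1)
    best = {"E": np.inf, "params": None, "beta": None}

    for beta, eta in zip(gains, rates):
        E_old, iters = energy(params, train_set), 0
        while True:
            params = bep_iteration(params, train_set, beta, eta)       # step 118
            E_new = energy(params, train_set)                          # step 120
            if E_new < best["E"]:                                      # steps 122-124
                best = {"E": E_new, "params": params, "beta": beta}
            iters += 1
            if iters > 50:                                             # steps 126-130: "kick"
                for _ in range(10):
                    params = bep_iteration(params, train_set, beta, 0.75)
                break                                                  # jump to next gain step
            if abs(E_new - E_old) <= 0.001 * abs(E_new):               # step 134: converged here
                break
            E_old = E_new                                              # step 135
    return best
```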
- In step 134, if the relative change in the magnitude of the energy does not exceed the threshold value, algorithm 100 prepares to move on to the next higher gain level. First, it sets the value of E_old to E_new (step 139). Then, it checks β to determine whether it has reached the maximum gain level allowed (step 138). If β is less than β_final, algorithm 100 jumps to the next gain and learning rate (step 140) and then branches back to step 118 to repeat the above-described BEP training procedure.
- Algorithm 100 adds a third node and again branches back to step 108 to see what effect the third node yields (step 148).
- Algorithm 100 continues adding nodes until the resulting improvement in performance is no greater than 10%. At that point, algorithm 100 selects the structure and values of the state variables which yielded the lowest energy and terminates.
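A sketch of the dynamic-architecture loop just described: hidden nodes are added one at a time, the network is re-initialized and retrained for each architecture, and growth stops when the improvement from the last added node is no greater than 10%. Here train_with_gdga is a placeholder for the GDGA procedure, the interpretation of "improvement" as a relative decrease in training energy is an assumption, and the max_nodes cap of 15 is only suggested by the earlier remark about 10 to 15 hidden nodes.

```python
def grow_network(train_with_gdga, max_nodes=15):
    """Add hidden nodes until the relative improvement drops to 10% or less, then
    return the structure and state variables that yielded the lowest energy."""
    results = []   # (energy, n_hidden, state)
    n_hidden = 1
    while n_hidden <= max_nodes:
        energy, state = train_with_gdga(n_hidden)   # re-initialize and retrain (step 108)
        results.append((energy, n_hidden, state))
        if len(results) > 1:
            prev_energy = results[-2][0]
            improvement = (prev_energy - energy) / prev_energy   # lower energy = better
            if improvement <= 0.10:                              # gain from new node <= 10%
                break
        n_hidden += 1
    return min(results, key=lambda r: r[0])
```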
- The node function partitions different regions of the input space by constructing hyperplanes to approximate the region boundaries.
- The input to each node, given by Eq. 2, is a linear equation for a plane in multidimensional space. As more hidden-layer nodes are used, the actual boundaries are more closely approximated.
- The "sigmoid" transfer function implements a sharp or fuzzy boundary based on a high or low magnitude of the gain term in Eq. 1. An important characteristic of the sigmoid function is that it acts globally across the input space, thereby allowing the possibility of forming a compact representation of the salient features in the training data.
- An MLP network trained with the GDGA algorithm was evaluated using actual radar signatures.
- The task was to separate the radar signatures into two classes: object types A and B.
- The problem is difficult because the effects of object geometry and measurement conditions on the radar signature are not well characterized. As a consequence, the signatures are not easy to discriminate, and it is not clear which transformation will increase the separability.
- A data set consisting of 3692 radar signatures (equal numbers of types A and B) was used in this study.
- The training procedure consisted of initializing the network weights and offsets to a set of random values and presenting a certain percent of the data set in a random order to train the network. After training, the weights and offsets were fixed and the entire data set (3692 signatures) was used to test the network. The combination of training and then testing the network is defined as a trial.
- A specific MLP network was evaluated by running 100 trials, where the weights and offsets are initialized to a different set of random values at each trial. Each trial also selected a different (random) training set. The performance was defined as the percent of the input patterns correctly classified, based on the distance between the network output and a set of outputs for the two signature types. The network had a single output node with output values of 0.95 for type A and 0.05 for type B signatures. For each input pattern, the target class with the minimum (Euclidean) distance to the network output was chosen as the pattern class. The trial with the maximum percent correct during testing is used for performance comparisons. The percent of the data set used for training and the number of nodes in the hidden layer were treated as independent parameters in the experiment.
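A sketch of the classification rule used in each trial: the single network output is compared, by distance, against the target outputs 0.95 (type A) and 0.05 (type B), and the closer target determines the predicted class. The function names and array layout are assumptions.

```python
import numpy as np

TARGETS = {"A": 0.95, "B": 0.05}   # desired single-node outputs for the two signature types

def classify(network_output):
    """Assign the class whose target output is closest to the network output."""
    return min(TARGETS, key=lambda c: abs(network_output - TARGETS[c]))

def percent_correct(outputs, labels):
    """Performance for one trial: percent of input patterns correctly classified."""
    predictions = [classify(o) for o in outputs]
    return 100.0 * np.mean([p == t for p, t in zip(predictions, labels)])

# e.g. percent_correct([0.91, 0.12, 0.60], ["A", "B", "B"]) -> about 66.7 (two of three correct)
```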
- The percent correct during training for a single-node network is shown in Fig. 4 as a function of the percent of the data set used for training.
- The limited capacity of a single hidden-layer node is shown by the decrease in training-set performance from 100% to 93% correct. This performance decrease occurred when the training set size was increased from 1% to 25% of the entire data set.
- The testing performance is significantly increased by adding nodes to the hidden layer. Beyond five nodes, though, any further node addition only slightly improves the performance.
- The ten-node network could account for 95% of the data after training on 20% of the data and was able to attain 97% correct during testing as the training percentage was increased. Further investigation showed that the remaining 3% of the data that the network could not account for were actually bad measurements. Apparently the network was able to discriminate between signatures and also identify a non-signature without being explicitly trained as to what constitutes a bad measurement.
- The network was also trained with the standard BEP algorithm, and the testing results are shown in Fig. 5.
- The generalization capability of the network trained with BEP was less than that of a network trained with the GDGA algorithm. This effect was most pronounced when training on a very small percent of the data set (a situation that is especially relevant to real-world problems).
- The GDGA technique is able to explore the state space of the network more thoroughly at the low gain values than BEP training operating at a single gain.
- The BEP algorithm was initialized to the optimum value of gain found by training with the GDGA algorithm.
- The "history" of training at many different gain values was apparently significant to the generalization capability of the network.
- Both the BEP and GDGA algorithms required approximately the same number of iterations for training (approx.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Feedback Control In General (AREA)
Abstract
A method of training a neural network (1) having an output layer (20) and at least one middle layer (16) including one or more internal nodes (18), each of which is characterized by a node activation function having a gain. The method includes setting the gain on at least some of the internal nodes equal to an initial gain value; training the multi-layer perceptron starting with the initial gain value; and changing the gain on at least some of the internal nodes during training, the gain change being in a direction which increases the sensitivity of the multi-layer perceptron.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US67922591A | 1991-04-02 | 1991-04-02 | |
| US679,225 | 1991-04-02 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO1992017849A1 true WO1992017849A1 (fr) | 1992-10-15 |
Family
ID=24726072
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US1992/002796 Ceased WO1992017849A1 (fr) | 1992-04-01 | Automatic design of signal processors using neural networks |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO1992017849A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2001020364A1 (fr) * | 1999-09-10 | 2001-03-22 | Henning Trappe | Method for processing seismic measured data with a neuronal network |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5033006A (en) * | 1989-03-13 | 1991-07-16 | Sharp Kabushiki Kaisha | Self-extending neural-network |
-
1992
- 1992-04-01 WO PCT/US1992/002796 patent/WO1992017849A1/fr not_active Ceased
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5033006A (en) * | 1989-03-13 | 1991-07-16 | Sharp Kabushiki Kaisha | Self-extending neural-network |
Non-Patent Citations (2)
| Title |
|---|
| RUMELHART et al., "Learning Internal Representations by Error Propagation", PARALLEL DISTRIBUTED PROCESSING, Volume 1, Foundations, MIT Press, 1986. * |
| VOGL et al., "Accelerating the Convergence of the Back Propagation Method", BIOLOGICAL CYBERNETICS, SPRINGER-VERLAG, 1988, Pages 250, 259, 260. * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2001020364A1 (fr) * | 1999-09-10 | 2001-03-22 | Henning Trappe | Method for processing seismic measured data with a neuronal network |
| US6725163B1 (en) | 1999-09-10 | 2004-04-20 | Henning Trappe | Method for processing seismic measured data with a neuronal network |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Pal et al. | Multilayer perceptron, fuzzy sets, and classification | |
| US6167390A (en) | Facet classification neural network | |
| Sutton et al. | Online learning with random representations. | |
| Murray et al. | Synaptic weight noise during multilayer perceptron training: fault tolerance and training improvements | |
| Maclin et al. | Combining the predictions of multiple classifiers: Using competitive learning to initialize neural networks | |
| Denoeux et al. | Initializing back propagation networks with prototypes | |
| Billings et al. | The determination of multivariable nonlinear models for dynamic systems using neural networks | |
| US5943661A (en) | Hybrid neural network classifier, systems and methods | |
| Yoon et al. | Training algorithm with incomplete data for feed-forward neural networks | |
| US5469530A (en) | Unsupervised training method for a neural net and a neural net classifier device | |
| US6965885B2 (en) | Self-organizing feature map with improved performance by non-monotonic variation of the learning rate | |
| Du et al. | Multilayer perceptrons: architecture and error backpropagation | |
| Lee et al. | A two-stage neural network approach for ARMA model identification with ESACF | |
| WO1992017849A1 (fr) | Automatic design of signal processors using neural networks | |
| Moreno et al. | Efficient adaptive learning for classification tasks with binary units | |
| Kia et al. | Unsupervised clustering and centroid estimation using dynamic competitive learning | |
| Taheri et al. | Artificial neural networks | |
| WO1991002322A1 (fr) | Reseau neural de propagation de configurations | |
| Karouia et al. | Performance analysis of a MLP weight initialization algorithm. | |
| Wann et al. | Clustering with unsupervised learning neural networks: a comparative study | |
| Hartono et al. | Adaptive neural network ensemble that learns from imperfect supervisor | |
| Owens et al. | A multi-output-layer perceptron | |
| Kim et al. | Pattern classification of vibration signatures using unsupervised artificial neural network | |
| de Paula Canuto | Combining neural networks and fuzzy logic for applications in character recognition | |
| Villalobos et al. | Learning Evaluation and Pruning Techniques |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AK | Designated states |
Kind code of ref document: A1 Designated state(s): CA JP |
|
| AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH DE DK ES FR GB GR IT LU MC NL SE |
|
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
| 122 | Ep: pct application non-entry in european phase | ||
| NENP | Non-entry into the national phase |
Ref country code: CA |