
WO2016058055A1 - Trainable analogue block - Google Patents

Trainable analogue block

Info

Publication number
WO2016058055A1
Authority
WO
WIPO (PCT)
Prior art keywords
output
tab
input
weight
layer
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/AU2015/050638
Other languages
French (fr)
Inventor
Floris André Van SCHAIK
Tara Julia HAMILTON
Jonathon Craig TAPSON
Chetan Singh THAKUR
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Western Sydney
Original Assignee
University of Western Sydney
Priority claimed from AU2014904154A
Application filed by University of Western Sydney
Publication of WO2016058055A1
Current legal status: Ceased

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means


Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Semiconductor Integrated Circuits (AREA)

Abstract

A trainable analogue circuit block (TAB) (10) includes an input layer (12) defining at least one input node (14). At least one hidden layer (16) defines at least one hidden node (20). A non-linear circuit connects the at least one hidden layer (16) to the input layer (12), the non-linear circuit comprising at least one non-linear weight (18). An output layer (22) defines at least one output node (24), the output layer (22) being connected to the at least one hidden layer (16) via at least one trainable weight (26).

Description

"Trainable analogue block"
Cross-Reference to Related Applications
[0001] The present application claims priority from Australian Provisional Patent
Application No 2014904154 filed on 17 October 2014, the contents of which are incorporated herein by reference.
Technical Field
[0002] This disclosure relates, generally, to trainable circuits and, more particularly, to a trainable analogue circuit block (referred to as a trainable analogue block (TAB)) and a method of training trainable weights of a TAB.
Background
[0003] Analogue circuits are an essential part of most electronic systems. Any electronic system that interacts with the physical world through sensors, actuators or for signal transmission needs to be able to sense or to generate analogue signals. Often, real advantages in terms of speed or power consumption can be obtained by doing some of the signal processing for these signals in the analogue domain itself. However, as integrated circuit (IC) manufacturing techniques move to smaller and smaller feature sizes (now down to less than 25nm in the most advanced technologies), random variations in size between transistors become a serious problem, particularly for the design of analogue circuits.
[0004] In the past decade a number of algorithms have arisen in the neuromorphic engineering, machine learning and artificial intelligence communities which make use of massive parallelism with random projections and nonlinear operations to implement stochastic computations. Two such algorithms are the Extreme Learning Machine and the Neural Engineering Framework. Software implementations of these algorithms have been used extensively in function regression and classification problems with considerable success. To date, the same success has not been achieved via hardware implementations.

Summary
[0005] In a first aspect, there is provided a trainable analogue circuit block (TAB) which includes
an input layer defining at least one input node;
at least one hidden layer defining at least one hidden node;
a non-linear circuit connecting the at least one hidden layer to the input layer, the non-linear circuit comprising at least one non-linear weight; and
an output layer defining at least one output node, the output layer being connected to the at least one hidden layer via at least one trainable weight.
[0006] In this specification, unless the context clearly indicates otherwise, the term
"nonlinear" as it relates to a device, whether a weight or a circuit, means that an output of the device is not directly proportional to an input into the device.
[0007] The at least one hidden layer may define a plurality of hidden nodes, there being a larger number of hidden nodes than there are input nodes in the input layer so that at least one input signal input into the input layer is projected to a higher dimension in the at least one hidden layer via the non-linear weights.
[0008] The at least one hidden node may be non-linear. At least one of the non-linear circuit and the non-linear hidden nodes may have a random component based on intrinsic
randomness in a VLSI process and nonlinearities inherent in transistors.
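
In software terms, the structure described in the two preceding paragraphs is a single-hidden-layer network with fixed, randomly mismatched nonlinear input weights and a trainable linear readout. The following Python sketch is purely illustrative: the explicit random gains and offsets stand in for the transistor mismatch that the chip obtains for free, and none of the names or values are taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

n_hidden = 455                            # hidden node blocks (as in the TAB described later)
gain = rng.normal(1.0, 0.2, n_hidden)     # per-node random gain (stands in for mismatch)
offset = rng.normal(0.0, 0.1, n_hidden)   # per-node random offset (stands in for mismatch)

def hidden_activations(v_in):
    """Project a scalar input to a higher-dimensional nonlinear representation."""
    return np.tanh(gain * (v_in - offset))

w_out = np.zeros(n_hidden)                # trainable linear output weights

def tab_output(v_in):
    """Linear readout of the nonlinear hidden layer."""
    return hidden_activations(v_in) @ w_out
```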
[0009] The TAB may include a controllable offset applied to at least one of the at least one non-linear weight and the at least one hidden node. The controllable offset may be
implemented as a distributed resistor element functioning as a voltage divider and generating different reference voltages to be input into at least one of the at least one non-linear weight and the at least one hidden node.
[0010] The at least one node of the output layer may be linear.
[0011] Output signals of the hidden layer and the output layer may be represented as currents whereas the at least one input signal may be represented as a voltage. Further, all nodes may be simple nodes in a circuit with connections between the nodes being implemented with actual transistors.
[0012] The non-linear circuit may be implemented as one of an inverter and a differential pair. The inverter can be implemented with only two transistors and its output current approximates a tangent function (tan) of the input voltage relative to a threshold that varies as a function of transistor mismatch. The differential pair may be implemented with three transistors - two for the pair and one as a bias transistor. In that case the output current is a hyperbolic tangent function (tanh) of the input voltage, subject to random offset and gain from transistor mismatch.
[0013] The at least one trainable weight connecting the output layer to the at least one hidden layer may be a linear weight.
[0014] In one embodiment the TAB may perform on-chip training of the at least one weight. Thus, the at least one trainable weight may be trained by a weight update rule. The weight update rule may increment or decrement the at least one weight by an amount depending on the product of the sign of an error value between a desired output value and an actual output value and the sign of a hidden layer activation unit, with the at least one weight being digitised and stored in a counter that counts up or down one unit depending on the product of the signs.
[0015] The weight update rule may be configured to slowly decrease a unit of weight change during training so that it takes larger steps at the beginning than at the end to speed up training time.
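
A behavioural sketch of this sign-sign update rule follows; it models the digitised weights as signed integer counters and anneals the weight-change unit as described above. It is not the on-chip logic itself, and the counter width, step schedule and epoch count are illustrative assumptions.

```python
import numpy as np

def train_sign_sign(samples, targets, hidden_activations, n_hidden,
                    n_epochs=50, counter_bits=13):
    counters = np.zeros(n_hidden, dtype=np.int64)   # digitised weights
    max_count = 2 ** (counter_bits - 1) - 1         # symmetric counter range (an assumption)
    step = 8                                        # initial weight-change unit
    for _ in range(n_epochs):
        for v_in, target in zip(samples, targets):
            h = hidden_activations(v_in)
            y = h @ counters                        # actual output (arbitrary scale)
            # Each weight moves by the current unit, in the direction given by
            # the product of the error sign and the hidden-node sign.
            counters += step * (np.sign(target - y) * np.sign(h)).astype(np.int64)
            np.clip(counters, -max_count, max_count, out=counters)
        step = max(1, step // 2)                    # larger steps early, smaller late
    return counters
```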
[0016] In another embodiment the TAB may conduct off-chip training of the at least one weight. Thus, the TAB may be in communication with an external computational device, the computational device calculating the at least one weight based on a measurement of an activation of the at least one hidden layer over a desired range of input values and using an error value between a desired output value and an actual output value to program the trainable weight with the externally calculated values.
[0017] If the output weights and the output nodes are linear, then the computational device may calculate the weights using the measurement of an activation of the at least one hidden layer over a desired range of input values and the desired output values for the range of desired input values without ever measuring the actual output value. This may be a good first step in determining the output weights.
[0018] The computational device may employ an iterative process with a weight being calculated, programmed into the TAB and the process then repeated. The input layer may include a plurality of input nodes and the at least one output node may be connected to at least one of the input nodes to form a recurrent network.
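
Because the output stage is linear, the off-chip computation in the preceding paragraphs reduces to a linear least-squares problem: measure the hidden-layer activations H over the desired input range and solve H·w ≈ y for the output weights. A minimal sketch, leaving out regularisation and quantisation to the on-chip weight format:

```python
import numpy as np

def solve_output_weights(hidden_activations, train_inputs, desired_outputs):
    # H[i, j] is the measured activation of hidden node j for training input i.
    H = np.stack([hidden_activations(v) for v in train_inputs])
    # Least-squares fit of the linear output weights to the desired outputs;
    # the actual output never needs to be measured for this step.
    w, *_ = np.linalg.lstsq(H, np.asarray(desired_outputs), rcond=None)
    return w
```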
[0019] In a second aspect, there is provided a trainable analogue circuit block (TAB) which includes
an input layer defining at least one input node;
at least one hidden layer defining at least one hidden node;
a non-linear circuit connecting the at least one hidden layer to the input layer, the non-linear circuit comprising at least one non-linear weight;
an output layer defining at least one output node, the output layer being connected to the at least one hidden layer via at least one trainable weight; and
a controllable offset applied to at least one of the non-linear weights and the hidden layer.
[0020] The controllable offset may be implemented as a distributed resistor element functioning as a voltage divider and generating different reference voltages to be input into at least one of the at least one non-linear weight and the at least one hidden node.
[0021] In a third aspect, there is provided a trainable analogue circuit block (TAB) which includes
an input layer defining at least one input node;
at least one hidden layer defining at least one hidden node;
a non-linear circuit connecting the at least one hidden layer to the input layer, the non-linear circuit comprising at least one weight; and
an output layer defining at least one output node, the output layer being connected to the at least one hidden layer via at least one trainable weight, training of the at least one trainable weight being implemented via a training rule which:
applies a training input signal;
generates an output signal;
determines an error value indicative of an error between the output signal and a desired signal;
selectively increments and decrements at least one weighting control signal based on the error value; and
applies the at least one weighting control signal to an intermediate current signal to cause convergence between the output signal and the desired signal.
[0022] The training rule may increment or decrement the at least one trainable weight by an amount depending on a product of a sign of the error value between a desired value of the output signal and an actual value of the output signal and a sign of a hidden layer activation unit with the at least one weight being digitised and stored in a counter that counts up or down one unit depending on the product of the signs.
[0023] In a fourth aspect, there is provided a method of training trainable weights of a trainable analogue circuit block (TAB), the method including
applying a training input signal;
generating an output signal;
determining an error value indicative of an error between the output signal and a desired signal;
selectively incrementing and decrementing at least one weighting control signal based on the error value; and
applying the at least one weighting control signal to an intermediate current signal to cause convergence between the output signal and the desired signal.
[0024] The method may include incrementing or decrementing a trainable weight by an amount depending on the product of the sign of the error value between a desired value of the output signal and an actual value of the output signal and the sign of a hidden layer activation unit, and digitising and storing the weight in a counter that counts up or down one unit depending on the product of the signs.

Brief Description of Drawings
[0025] Embodiments of the disclosure will now be described by way of example with reference to the accompanying drawings in which: -
[0026] Fig. 1 shows a schematic representation of the structure of an embodiment of a trainable analogue block (TAB);
[0027] Fig. 2 shows a schematic diagram of a non-linear circuit used in the TAB;
[0028] Fig. 3 shows a schematic diagram of a current splitter circuit used in the TAB;
[0029] Fig. 4 shows a schematic diagram of a digital block used together with the current splitter circuit to calculate output weights to be applied to output nodes of an output layer of the TAB;
[0030] Fig. 5 shows a schematic diagram of an implementation of a combination of a nonlinear weight and a node block of the hidden layer of the TAB;
[0031] Fig. 6 shows a schematic diagram of a top level design of the TAB;
[0032] Fig. 7 shows an Input/Output diagram of the top level design of the TAB of Fig. 6;
[0033] Figs. 8-11 show other embodiments of a trainable analogue block (TAB);
[0034] Fig. 12 shows a result of an on chip training of trainable output weights of the TAB and an error curve associated with the training of the output weights; and
[0035] Fig. 13 shows an input curve and an output curve of an off chip training regime associated with trainable output weights of the TAB.
Detailed Description of Exemplary Embodiments
[0036] In Fig. 1 of the drawings, reference numeral 10 generally designates a first embodiment of a trainable analogue circuit block (TAB). The TAB 10 includes an input layer 12 defining an input node 14. A hidden layer 16 is connected to the input layer 12 through a non-linear circuit in the form of a plurality of non-linear weights 18. The hidden layer 16 defines a plurality of hidden nodes 20. In the illustrated embodiment, these hidden nodes 20 are non-linear. The nonlinearity of the hidden nodes 20 has a random component as a result of device mismatch from the manufacturing process.
[0037] An output layer 22 defining an output node 24 is connected to the hidden layer 16 via a plurality of trainable weights 26. In the illustrated embodiment, the trainable weights 26 are linear as is the node 24 of the output layer 22.
[0038] As illustrated, there are a larger number of hidden nodes 20 in the hidden layer 16 than there are input nodes 14 in the input layer 12 so that at least one input signal input into the input layer 12 is projected to a higher dimension in the hidden layer 16 via the weights 18. As shown in the embodiment of Fig. 1, the TAB 10 has a single input node 14 in its input layer 12 connected to multiple hidden nodes 20 of the hidden layer 16. These multiple hidden nodes 20 map down to a single output node 24 of the output layer 22.
[0039] In this embodiment, a controllable offset 28, which is implemented as a distributed polysilicon resistor element functioning as a voltage divider, is applied to the non-linear weights 18 of the TAB 10. In other embodiments (Figs. 8, 10 and 11), the controllable offset 28 is applied to the hidden nodes 20 of the hidden layer 16 of the TAB 10. The controllable offset 28 generates different reference voltages to be input into the hidden nodes 20 of the hidden layer 16.
[0040] It is noted that output signals of the hidden layer 16 as well as of the output layer 22 are represented as currents whereas the input signal is represented as a voltage.
[0041] Each non-linearity, whether in the weight 18 or the hidden node 20, is implemented as a differential pair using three transistors. Two of the transistors make up the differential pair with one transistor operating as a bias transistor. An output current from the differential pair is a hyperbolic tangent function (tanh) of the input voltage, subject to random offset and gain from transistor mismatch.

[0042] In another embodiment (not shown), each non-linearity may be implemented as an inverter with only two transistors, in which case its output current approximates a tangent function (tan) of the input voltage relative to a threshold that varies as a function of transistor mismatch.
[0043] An example of a non-linear circuit is shown in greater detail in Fig. 2 of the drawings and is designated generally by reference numeral 29. This circuit 29 comprises a differential pair having a pair of transistors 30 and 32 and a third, bias transistor 34, all implemented as MOSFETs. In the case of the embodiments shown in Figs. 1 and 9 (i.e. having a single input node 14 in the input layer 12), these transistors 30, 32, 34 constitute the non-linear weight 18 and the rest of the circuit of Fig. 2 forms part of the associated hidden node 20 of the hidden layer 16. This rest of the circuit is then linear (with some distortion) in generating Itanh as the hidden node output.

[0044] The circuit 29 is implemented in VLSI technology. Furthermore, the TAB 10 is designed for implementation in the TSMC 65 nm technology with a chip supply voltage, VDD, of 1.2 V.
[0045] The sharing of current between transistors 30 and 32 depends on their respective gate voltages, Vin (the input voltage) and Vref (a constant reference voltage). If all the transistors operate below threshold and in saturation, and on the assumption that transistors 30 and 32 have the same sub-threshold slope factor, kn, then the currents in transistors 30 and 32 can be approximated as follows:

I1 = Ib * exp(kn*Vin/UT) / [exp(kn*Vin/UT) + exp(kn*Vref/UT)]   (1)

I2 = Ib * exp(kn*Vref/UT) / [exp(kn*Vin/UT) + exp(kn*Vref/UT)]   (2)

where Ib is the bias current set by transistor 34 and UT is the thermal voltage.
[0046] The curve profiles for the output currents, I1 and I2, are functions of the differential input voltage between Vin and Vref and are similar to the mathematical tanh curve. The circuit 29 therefore behaves like a tanh function if the differential voltage is symmetrical across zero. I1 saturates to the maximum bias current if Vin is higher than Vref by more than 4 UT (about 100 mV). In the TAB 10, each weight 18 or hidden node 20 exhibits a different Vref, which results in a different nonlinear curve for each weight 18 or each hidden node 20.

[0047] Itanh is copied from I1 using a current mirror consisting of two transistors 106, 108 and is further used in a current splitter circuit 38 (Fig. 3). I1 is also mirrored, using two current mirrors consisting of transistors 106, 110 and transistors 112, 114, to an output node where it is compared with I2. I2 is mirrored to the output node via a current mirror consisting of transistors 116, 118 to determine the sign of the hidden node, signH, signifying the node type as either excitatory or inhibitory. The voltage at the gate of the bias transistor 34 sets the bias current through it, and this bias current lies in the range of a few nanoamperes (nA).
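
Numerically, equations (1) and (2) reduce to a tanh of the differential voltage, since I1 - I2 = Ib*tanh(kn*(Vin - Vref)/(2*UT)). The sketch below evaluates both forms with representative subthreshold values (the Ib, kn and UT figures are illustrative, not taken from the patent):

```python
import numpy as np

def diff_pair_currents(v_in, v_ref, i_b=5e-9, kappa_n=0.7, u_t=0.025):
    """Branch currents of a subthreshold differential pair, per equations (1) and (2)."""
    a = np.exp(kappa_n * v_in / u_t)
    b = np.exp(kappa_n * v_ref / u_t)
    return i_b * a / (a + b), i_b * b / (a + b)

# The difference of the branch currents is a tanh of the differential voltage,
# saturating once Vin exceeds Vref by roughly 4*UT (about 100 mV).
i1, i2 = diff_pair_currents(v_in=0.75, v_ref=0.70)
print(i1 - i2)                                     # direct evaluation
print(5e-9 * np.tanh(0.7 * 0.05 / (2 * 0.025)))    # tanh form, same value
```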
[0048] In Fig. 3 of the drawings, a current splitter circuit 38 is illustrated which is used together with a digital block 40 (Fig. 4) to calculate the output weights 26. In the illustrated embodiment, the hidden nodes 20 of the hidden layer 16 are connected via linear, binary-weighted weights 26 to the output node 24 of the output layer 22. Further, in the case of this embodiment, being a single input/single output device, there are thirteen stages 42 in the connection between each hidden node 20 of the hidden layer 16 and the output node 24 of the output layer 22. These stages 42 control the amount of current that flows from the hidden nodes 20 to the output node 24 and are implemented by way of the current splitter circuit 38 with thirteen stages 42.

[0049] The current in each stage 42 is controlled via a digital switch of the digital block 40 (Fig. 4). The input current Itanh from the circuit 29 is divided successively to form a geometrically-spaced series of small currents. At each stage 42, a fixed fraction of the current is split off with the remainder of the current continuing to the following stages 42. The last stage 42 is configured to terminate the connection as though it were infinitely long.
[0050] The current splitter principle of the circuit 38 splits currents over all the operating ranges of the transistors, from weak to strong inversion, reasonably accurately as determined by the geometry of the transistors in each stage 42 of the circuit 38.
[0051] Each stage 42 is constituted by a transistor 44 and two further transistors 46, and the connection is terminated with a further transistor 48. As described above, in the illustrated embodiment, the circuit 38 has thirteen stages 42. In general, the circuit 38 can have N stages, the current at the k-th stage being Itanh/2^k. The final current is the same as the penultimate current. Further, the transistors 44, 46 and 48 are all of equal size.

[0052] Each transistor 44, 46 and 48 is provided, at its p-FET gate, with a reference voltage being a master bias voltage, Vgbias.
[0053] Each stage 42 has two transistor switches 50 to route the stage current either to the useful current, Igood, as shown at 52, or to the ground current, Idump, as shown at 54. Igood is mirrored in a current mirror 56 to generate Iout, as shown at 58, for the output node 24. Iout is further routed to IoutN, as shown at 60, or IoutP, as shown at 62, depending on the state of a signW switch 64 (Fig. 4) of the digital block 40, which determines the sign of the weight 26 along with the value of the weight 26.
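
Functionally, the splitter and switches form a binary-weighted current DAC: stage k carries roughly Itanh/2^k, and the weight bits select which stage currents are routed to Igood rather than dumped. A behavioural sketch with ideal current division (the termination stage and transistor non-idealities are ignored):

```python
def weighted_current(i_tanh, weight_bits):
    """Sum the stage currents selected by the weight bits (ideal splitting).

    weight_bits[k - 1] controls stage k, which carries i_tanh / 2**k; stages
    whose bit is 0 route their current to ground (Idump) instead.
    """
    i_good = 0.0
    for k, bit in enumerate(weight_bits, start=1):
        if bit:
            i_good += i_tanh / 2 ** k
    return i_good

# With all thirteen bits set, almost the entire input current is passed:
print(weighted_current(1.0, [1] * 13))   # 1 - 2**-13, approximately 0.99988
```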
[0054] The digital block 40 determines the magnitude and polarity of each weight 26 that controls its associated circuit 38. In this embodiment, the TAB 10 uses an on-chip training technique for training the weights 26. This training technique or training rule involves applying a training input signal, generating an output signal and determining an error value indicative of an error between the generated output signal and a desired output signal. The rule further involves selectively incrementing or decrementing at least one weighting control signal based on the determined error value. The at least one weighting control signal is applied to an intermediate current signal to cause convergence between the generated output signal and the desired output signal.
[0055] The incrementing/decrementing step involves incrementing or decrementing the weight 26 by an amount depending on a product of a sign of the error value between the generated, actual value of the output signal and the desired value of the output signal as well as a sign of a hidden layer node 20 of the hidden layer 16. The weight 26 is digitised and stored in a counter of the digital block 40 that counts up or down one unit depending on the product of the signs to provide a value of the weight 26.
[0056] The digital block 40 implements a learning algorithm such as the Ada2 learning algorithm which, as indicated above, uses a sign of the hidden layer node 20 of the hidden layer 16 and the sign of the error value to update the magnitude and polarity of the relevant weight 26.
[0057] The digital block 40 contains internal shift registers and all the shift registers of the hidden nodes 20 of the hidden layer are connected serially as a long chain. The shift registers provide flexibility to use off-chip calculated weights, if desired, instead of the above described on-chip generated weights. A top-level pin, CTRL_MUX, 66 (Fig. 7) determines whether to use off-chip weights or on-chip weights. Further, the digital block 40 generates an overflow bit, as shown at 67 (Fig. 4) when a counter of the digital block 40 reaches its maximum value.
[0058] The clock in the digital block 40 is synchronised with the input data. For each packet of input data, the digital block 40 for each hidden node 20 updates its counters which then act as weights. For the illustrated embodiment of a thirteen-stage circuit 38, each counter of the digital block 40 is a 13-bit counter with each bit of the counter controlling one stage 42 of the circuit 38.
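
Pulling the last few paragraphs together, each hidden node's digital block can be modelled as a counter with an overflow flag and a multiplexer that chooses between the on-chip counter and an off-chip weight loaded through the shift-register chain. This is a behavioural sketch only; the real block is custom logic and the details below are assumptions.

```python
class DigitalBlockModel:
    """Behavioural model of one hidden node's digital block (illustrative)."""

    def __init__(self, n_bits=13):
        self.n_bits = n_bits
        self.max_count = 2 ** n_bits - 1
        self.counter = 0            # on-chip trained weight
        self.shift_reg = 0          # off-chip weight loaded via the shift chain
        self.overflow = False       # contributes to the wired-AND overflow line

    def update(self, sign_err, sign_h, step=1):
        # Count up or down depending on the product of the two signs.
        self.counter += step if sign_err * sign_h > 0 else -step
        if not 0 <= self.counter <= self.max_count:
            self.overflow = True    # training is failing to converge
            self.counter = min(max(self.counter, 0), self.max_count)

    def weight_bits(self, ctrl_mux):
        # CTRL_MUX selects the counter (on-chip) or shift register (off-chip).
        value = self.shift_reg if ctrl_mux else self.counter
        return [(value >> k) & 1 for k in range(self.n_bits)]
```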
[0059] In Fig. 5 of the drawings, an implementation of a combination of one of the nonlinear weights 18 and one of the hidden nodes 20 of the hidden layer 16 is illustrated and is designated generally by the reference numeral 68. The implementation 68 is a combination of the circuit 29, the circuit 38 and the digital block 40 for each connection. In the present embodiment, the TAB 10 comprises 456 node blocks 68, one of which is a test node block.
[0060] At any particular time, each node block 68 receives the same input voltage, Vin, but different reference voltages, Vref, on the differential pairs 30, 32 of each circuit 29. This results in different differential voltages for each node block 68 and, as a result, different currents, Itanh, generated for each node block 68.
[0061] As described above, a controllable offset 28, which is implemented as a distributed polysilicon resistor element functioning as a voltage divider, is applied either to each weight 18 or to each hidden node 20 of the hidden layer 16. More particularly, each Vref is tapped from the distributed polysilicon resistor element. End points of this resistor element are connected to top-level voltage pins, Vref1 and Vref2, as shown at 70 and 72, respectively, in Fig. 7 of the drawings. The polysilicon resistor element acts as a voltage divider which generates different reference voltages, Vref, for each node block 68.
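
The tapped divider can be modelled as a linear interpolation between the two end-point pin voltages; in the fabricated chip each tap additionally carries its own resistor and transistor mismatch. A short sketch with assumed end-point values:

```python
import numpy as np

def reference_voltages(v_ref1, v_ref2, n_nodes=455):
    """Ideal tap voltages of the distributed polysilicon divider (illustrative)."""
    return np.linspace(v_ref1, v_ref2, n_nodes)

v_refs = reference_voltages(0.3, 1.1)   # end points set by the VREF1 and VREF2 pins
```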
[0062] For each new input, the block 29 calculates Itanh and signH, which are passed to the circuit 38 and the digital block 40, respectively. The digital block 40, as described above, calculates the output weight 26 based on the sign of the output error, signErr, which comes from a top-level pin 74 (Fig. 7), and the sign of the hidden layer node, signH. The output weight from the digital block 40 acts as a switch in the circuit 38 to control the current.
[0063] As described above, each stage 42 of the circuit 38 has two transistor switches 50 to route the branch current either to the useful current, Igood, or to Idump which goes to ground. Igood is mirrored to make Iout. Iout, in turn, is further routed to the currents IoutP or IoutN, as determined by the polarity of the signW signal.
[0064] IoutP and IoutN currents of each node block 68 are globally connected to each other and summed to provide the final current which is the output current of the TAB 10.
[0065] Fig. 6 shows a top-level layout 76 of the TAB 10. The TAB 10 contains 455 node blocks 68 and a test node block 77. Current mirrors are provided to amplify IoutP and IoutN to a micro-ampere signal range. IoutP and IoutN are further connected together to create top-level currents IoutP_t and IoutN_t, shown at 78 and 80, respectively. These top-level currents 78 and 80 are further amplified, using current mirrors 82, to form resulting currents IoutP_ta and IoutN_ta, as shown at 84 and 86 respectively. The currents 84 and 86 are merged to create the final output current of the TAB 10, which is output on line 88.
[0066] Additional current mirrors 90 and 92 of the test node block 77 amplify Igood and Idump. The test node block 77 also has 3-bit shift registers that are connected to the shift registers of each digital block 40. These are used to program the number that is added to the counter in the digital block 40 of each hidden node 20. The counter of each digital block 40 is incremented or decremented by this number, the default counter increment being 1 when all three bits are zero.
[0067] The overflow signal 67 of each digital block 40 is connected in a wired-AND topology. Assertion of an overflow bit indicates an error in the training, in that the solution has not converged. Under normal operating conditions, the overflow bit of all the node blocks 68 should be low.
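Combining the above, a behavioural sketch of one on-chip training step: each node's counter moves by the programmed step in the direction sign(error) × sign(H), and hitting a counter limit raises the overflow flag carried on the wired-AND line. The saturation policy and the unsigned counter representation here are assumptions for the sketch.

```python
def train_step(counters, signs_h, sign_err, step=1, n_bits=13):
    """One weight update per input packet: counter += step * signErr * signH.
    Returns True if any counter saturated, i.e. the overflow condition."""
    limit = 2 ** n_bits - 1
    overflow = False
    for i, sign_h in enumerate(signs_h):
        new = counters[i] + step * sign_err * sign_h
        if new < 0 or new > limit:
            overflow = True            # would assert the overflow pin 67
            new = min(max(new, 0), limit)
        counters[i] = new
    return overflow
```

Decreasing `step` as training proceeds, which the 3-bit shift registers of the test node block 77 allow, takes large steps early and small steps late, shortening training time.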
[0068] Fig. 7 shows an input/output diagram of the top-level layout 76 of the TAB 10. The various pins of the top-level layout 76 are itemised in the table below.

[0069]
Pin        Type      Direction  Description
VDD        Analogue  Input      Supply voltage, 1.2 V
GND        Analogue  Input      Ground
VIN        Analogue  Input      Data input (0.7 ± 0.25 V, i.e. 0.45 V to 0.95 V)
VREF1      Analogue  Input      One end of the polyline; connected to the Vref1 voltage used to generate a different voltage for each Vref input of the tanh block (0.3 V to 1.1 V)
VREF2      Analogue  Input      Other end of the polyline; connected to the Vref2 voltage used to generate a different voltage for each Vref input of the tanh block (0.3 V to 1.1 V)
VG_R2R     Analogue  Input      Bias voltage for the R2R block (-0.5 V to 0.5 V)
IB_TANH    Analogue  Input      Voltage/current for the bias transistor of the tanh block (voltage range: hundreds of mV; current range: tens of nA)
SIGN_ERR   Digital   Input      Sign of the error signal, generated externally [Error = target - observed (VDD or GND)]
CTRL_MUX   Digital   Input      Selects the weight source: counter or shift register
SHIFT_IN   Digital   Input      Input of the shift register chain
CLK_SHIFT  Digital   Input      Clock for the shift registers (frequency: 1-10 MHz)
SHIFT_OUT  Digital   Output     Output of the shift register chain
[Further rows of the pin table were lost to an image placeholder in the source; only the word "circuit" survives.]
[0070] The above embodiment describes a single-input/single-output (SISO) implementation of the TAB 10. An example of such an implementation is a transconductance circuit which can be used as a voltage-to-current converter, or the like. A benefit of implementing a transconductance circuit by way of the TAB 10 is that the TAB 10 can realise an arbitrary relationship between an input voltage and an output current. Whereas, in conventional analogue circuit design, such current-voltage relationships are carefully crafted using transistors in a lengthy design process, in the TAB 10 the relationship is simply trained. This significantly reduces the design cycle for analogue circuits. Also, as described above, traditional analogue circuits remain sensitive to transistor mismatch and must be designed to actively mitigate such mismatch. Conversely, in the case of the present TAB 10, transistor mismatch is essential for correct operation. The training process ensures that the correct function is learnt, taking into account the exact transistor mismatch in the particular TAB 10 being trained.
[0071] Fig. 8 shows a variation of the TAB 10. With reference to Fig. 1 of the drawings, like reference numerals refer to like parts, unless otherwise specified. In this embodiment, the input layer 12 comprises a plurality of input nodes 14. Similarly, the output layer 22 comprises a plurality of output nodes 24.
[0072] Such a TAB 10 having multiple input nodes 14 can be used to learn functions of multiple variables, such as multiplications, divisions, polynomial relations and Euclidean distance. This can then be used to extract correlations, modulate and demodulate signals and perform error measurements.
[0073] Further, multiple output nodes 24 effectively reuse the same projection into the higher dimension of the hidden layer 16 to learn different input-output mappings. Multiple output nodes 24 are also useful when a TAB 10 is used as a classifier. In such a case, each output node 24 can be trained to indicate how likely it is that the input signal or signals belong to the category which that output node 24 is trained to represent.
[0074] An example of such a TAB 10 incorporates at least twenty input nodes 14 in the input layer 12 and at least ten output nodes 24 in the output layer 22. Such a TAB 10 is used as a trainable classifier where the outputs can be trained to be either high or low, depending on what pattern of input is presented to the input nodes 14 of the input layer 12. The TAB 10 performs the classification directly on the analogue inputs without needing to digitise them.
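As an illustration of the classifier configuration, assuming linear output weights reading out a shared hidden-layer projection; the dimensions below match the example above, while the function names are hypothetical.

```python
import numpy as np

def classify(hidden, weights):
    """Each of the ten output nodes forms its own weighted sum of the same
    hidden activations; the strongest output names the class."""
    scores = weights @ hidden        # weights: (10, 455), hidden: (455,)
    return int(np.argmax(scores))
```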
[0075] A TAB 10 having a larger array of input nodes 14 in the input layer 12 and a larger array of output nodes 24 in the output layer 22 could be used in smart devices such as a handwritten digit classifier.
[0076] A TAB 10 in which the input nodes 14 receive differently delayed versions of at least one input signal can classify temporal signals, which can be used, for example, for heartbeat classification in a defibrillator.

[0077] Fig. 9 shows a further embodiment of the TAB 10. Once again, with reference to the previous drawings, like reference numerals refer to like parts, unless otherwise specified. In this embodiment, the TAB 10 has a single input node 14 in the input layer 12 with multiple output nodes 24 in the output layer 22.
[0078] A single input signal is supplied to the input node 14 of the input layer 12 and the TAB 10 is trained to map the single input value to multiple output values at the output nodes 24 in the output layer 22 simultaneously.
[0079] Fig. 10 of the drawings shows a further embodiment of the TAB 10 and, as in the previous cases, with reference to the previous drawings, like reference numerals refer to like parts, unless otherwise specified.
[0080] In this embodiment, the TAB 10 has a plurality of input nodes 14 in the input layer 12 but only a single output node 24 in the output layer 22, so that a one-dimensional output is created from multi-dimensional inputs. For example, a two-input, one-output TAB 10 provides an analogue multiplier, or a variable gain amplifier in which one of the inputs is used to control the gain.
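As a sketch of how such a two-input block might be trained, the target data for the multiplier and variable-gain-amplifier examples can be generated directly; the voltage ranges follow the data-input specification above, while the centring on 0.7 V and the gain scaling are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
x = rng.uniform(0.45, 0.95, size=(1000, 2))        # two analogue inputs, in volts
y_mult = (x[:, 0] - 0.7) * (x[:, 1] - 0.7)         # multiplier target
y_vga = (x[:, 0] - 0.7) * (x[:, 1] - 0.45) * 10.0  # gain set by the second input
```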
[0081] Fig. 11 of the drawings shows still a further embodiment of the TAB 10. Once again, with reference to the previous drawings, like reference numerals refer to like parts, unless otherwise specified.
[0082] This version of the TAB 10 is similar to the version shown in Fig. 8 of the drawings. Thus, the input layer 12 has multiple input nodes 14 and the output layer 22 has multiple output nodes 24.
[0083] In this embodiment, a feedback loop 94 feeds back an output signal from one of the output nodes 24 of the output layer 22 to one of the input nodes 14 of the input layer 12. The TAB 10 can be used as a filter with or without the feedback loop 94. As for the temporal signal classifier described above, the TAB 10 can be trained to implement the analogue equivalent of a finite impulse response (FIR) filter by presenting time-delayed samples of an input signal to the input nodes 14 of the input layer 12. Thus signals x[t], x[t-1], x[t-2], ... are presented to many input nodes simultaneously, one value per input node, by tapping the input nodes off a delay line.
[0084] If delayed versions of the output signal are fed back to some of the input nodes 14 of the input layer 12 via the feedback loop 94, an analogue equivalent of an infinite impulse response (IIR) filter can be implemented.
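A behavioural sketch of both filter configurations, treating a trained TAB as a black-box function of its input-node values; `tab` is a hypothetical stand-in for the trained block, not part of the design itself.

```python
import numpy as np

def tab_fir(x, tab, taps=8):
    """FIR-style use: x[t], x[t-1], ..., one delayed sample per input node."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        window = [x[t - k] if t - k >= 0 else 0.0 for k in range(taps)]
        y[t] = tab(window)
    return y

def tab_iir(x, tab, taps=4, fb_taps=2):
    """IIR-style use: delayed outputs are additionally fed back to
    dedicated input nodes via the feedback loop."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        xin = [x[t - k] if t - k >= 0 else 0.0 for k in range(taps)]
        yfb = [y[t - k] if t - k >= 0 else 0.0 for k in range(1, fb_taps + 1)]
        y[t] = tab(xin + yfb)
    return y

demo_tab = lambda v: sum(v) / len(v)   # moving-average stand-in for a trained TAB
```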
[0085] Still further, if analogue, continuous time delay lines are provided on-chip, all filters can be implemented without needing to sample the signal in either amplitude or time. The TAB 10 provides the additional advantage that the filter, implemented by way of the TAB 10, can be re-trained if the desired filter function changes.
[0086] Fig. 12 shows a graphic representation of on-chip training of the TAB 10 in which a sine function is generated, as shown at 96. The TAB 10 can be trained, using the circuit 38, the digital block 40 and the weight training rule, to obtain an output closely matching a desired or target output, as represented by line 98. The reduction in the error during the on-chip training procedure is shown at 100 in Fig. 12 of the drawings.
[0087] Fig. 13 shows an off-chip training implementation of the TAB 10. In this embodiment, the TAB 10 communicates with an external computational device, such as a microprocessor (not shown). The microprocessor measures the activation of the hidden nodes 20 of the hidden layer 16 over a range of desired input values and, using those measurements together with a set of desired output values for that range, calculates the desired weights. It then programs the trainable weights 26 with the externally calculated weights so that the TAB 10 outputs the desired output values.
[0088] An example of this is shown in Fig. 13 of the drawings with an input signal 102 input into the input layer 12 and a trained sine function 104 being output at the output layer 22 of the TAB 10.
[0089] It is a particular advantage of the described TAB 10 that the designer of the TAB 10 need not worry about device matching to obtain the required performance of the integrated circuit implemented by way of the TAB 10.

[0090] As described above, another advantage of the TAB 10 is that it can be used in numerous different applications by suitably modifying the number of input nodes in the input layer and/or the number of output nodes in the output layer. Still further, by providing a feedback loop, various other benefits, such as the provision of suitable filters, can be obtained.
[0091] A further advantage of the TAB 10 is that the same TAB 10 can be reused for many different purposes once manufactured and the same architecture can be used in different manufacturing technologies. This leads to significantly reduced design cycles for analogue circuits with an associated reduction in design cost. The TAB 10 can also be trained or retrained "on the job". This provides a major advantage in systems where the input/output mapping of a circuit needs to be changed because of changes in the system. An example of this is in a communications system where a TAB is used as a filter to process the analogue signal before digitisation and in which the communications channel changes over time. The TAB can be re-trained with the communications channel in the loop to compensate for these changes.
[0092] It will be appreciated that, while the disclosure has been described by way of input voltages producing output currents, the described embodiments can be implemented to convert input currents to output voltages using appropriate circuit elements. Thus, the TAB is able to accept an input current and/or produce an output voltage by way of such an implementation.
[0093] It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments, without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

Claims

CLAIMS:
1. A trainable analogue circuit block (TAB) which includes
an input layer defining at least one input node;
at least one hidden layer defining at least one hidden node;
a non-linear circuit connecting the at least one hidden layer to the input layer, the non-linear circuit comprising at least one non-linear weight; and
an output layer defining at least one output node, the output layer being connected to the at least one hidden layer via at least one trainable weight.
2. The TAB of claim 1 in which the at least one hidden layer defines a plurality of hidden nodes, there being a larger number of hidden nodes than there are input nodes in the input layer so that at least one input signal input into the input layer is projected to a higher dimension in the at least one hidden layer via the non-linear weights.
3. The TAB of claim 1 or claim 2 in which the at least one hidden node is non-linear.
4. The TAB of claim 3 in which at least one of the non-linear circuit and the non-linear hidden nodes have a random component based on intrinsic randomness in a VLSI process and nonlinearities inherent in transistors.
5. The TAB of any one of the preceding claims which includes a controllable offset applied to at least one of the at least one non-linear weight and the at least one hidden node.
6. The TAB of claim 5 in which the controllable offset is implemented as a distributed resistor element functioning as a voltage divider and generating different reference voltages to be input into at least one of the at least one non-linear weight and the at least one hidden node.
7. The TAB of any one of the preceding claims in which the at least one node of the output layer is linear.
8. The TAB of any one of the preceding claims in which all nodes are simple nodes in a circuit with connections between the nodes being implemented as actual transistors.
9. The TAB of any one of the preceding claims in which the non-linear circuit is implemented as one of an inverter and a differential pair.
10. The TAB of any one of the preceding claims in which the at least one trainable weight connecting the output layer to the at least one hidden layer is a linear weight.
11. The TAB of any one of the preceding claims in which the at least one trainable weight is trained by a weight update rule.
12. The TAB of claim 11 in which the weight update rule increments or decrements the at least one trainable weight by an amount depending on a product of a sign of an error value between a desired output value and an actual output value and a sign of a hidden layer activation unit with the at least one weight being digitised and stored in a counter that counts up or down one unit depending on the product of the signs.
13. The TAB of claim 12 in which the weight update rule is configured to decrease a unit of weight change during training so that it takes larger steps at the beginning than at the end to speed up training time.
14. The TAB of any one of claims 1 to 10 which is in communication with an external computational device, the computational device calculating the at least one weight based on a measurement of an activation of the at least one hidden layer over a desired range of input values and using an error value between a desired output value and an actual output value to program the trainable weight with the externally calculated values.
15. The TAB of claim 14 in which, if the output weights and the output nodes are linear, then the computational device calculates the weights using the measurement of an activation of the at least one hidden layer over a desired range of input values and the desired output values for the range of desired input values without ever measuring the actual output value. This may be a good first step in determining the output weights.
16. The TAB of claim 15 in which the computational device employs an iterative process with a weight being calculated, programmed into the TAB and the process then repeated.
17. The TAB of any one of the preceding claims in which the input layer includes a plurality of input nodes and the at least one output node is connected to at least one of the input nodes to form a recurrent network.
18. A trainable analogue circuit block (TAB) which includes
an input layer defining at least one input node;
at least one hidden layer defining at least one hidden node;
a non-linear circuit connecting the at least one hidden layer to the input layer, the non-linear circuit comprising at least one non-linear weight;
an output layer defining at least one output node, the output layer being connected to the at least one hidden layer via at least one trainable weight; and
a controllable offset applied to at least one of the non-linear weights and the hidden layer.
19. The TAB of claim 18 in which the controllable offset is implemented as a distributed resistor element functioning as a voltage divider and generating different reference voltages to be input into at least one of the at least one non-linear weight and the at least one hidden node.
20. A trainable analogue circuit block (TAB) which includes
an input layer defining at least one input node;
at least one hidden layer defining at least one hidden node;
a circuit connecting the at least one hidden layer to the input layer, the circuit comprising at least one weight; and
an output layer defining at least one output node, the output layer being connected to the at least one hidden layer via at least one trainable weight, training of the at least one trainable weight being implemented via a training rule which:
applies a training input signal;
generates an output signal;
determines an error value indicative of an error between the output signal and a desired signal;
selectively increments and decrements at least one weighting control signal based on the error value; and
applies the at least one weighting control signal to an intermediate current signal to cause convergence between the output signal and the desired signal.
21. The TAB of claim 20 in which the training rule increments or decrements the at least one trainable weight by an amount depending on a product of a sign of the error value between a desired value of the output signal and an actual value of the output signal and a sign of a hidden layer activation unit with the at least one weight being digitised and stored in a counter that counts up or down one unit depending on the product of the signs.
22. The TAB of claim 21 in which the weight update rule is configured to decrease a unit of weight change during training so that it takes larger steps at the beginning than at the end to speed up training time.
23. A method of training trainable weights of a trainable analogue circuit block (TAB), the method including
applying a training input signal;
generating an output signal;
determining an error value indicative of an error between the output signal and a desired signal;
selectively incrementing and decrementing at least one weighting control signal based on the error value; and
applying the at least one weighting control signal to an intermediate current signal to cause convergence between the output signal and the desired signal.
24. The method of claim 23 which includes incrementing or decrementing a trainable weight by an amount depending on a product of a sign of the error value between a desired value of the output signal and an actual value of the output signal and a sign of a hidden layer activation unit and digitising and storing the weight in a counter that counts up or down one unit depending on the product of the signs.
25. The method of claim 24 which includes decreasing a unit of weight change during training so that it takes larger steps at the beginning than at the end to speed up training time.
PCT/AU2015/050638 2014-10-17 2015-10-16 Trainable analogue block Ceased WO2016058055A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2014904154A AU2014904154A0 (en) 2014-10-17 Trainable analogue block
AU2014904154 2014-10-17

Publications (1)

Publication Number Publication Date
WO2016058055A1 true WO2016058055A1 (en) 2016-04-21

Family

ID=55745888

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2015/050638 Ceased WO2016058055A1 (en) 2014-10-17 2015-10-16 Trainable analogue block

Country Status (1)

Country Link
WO (1) WO2016058055A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2245401A (en) * 1989-11-01 1992-01-02 Hughes Aircraft Co Neural network signal processor
US5222193A (en) * 1990-12-26 1993-06-22 Intel Corporation Training system for neural networks and the like
US5371834A (en) * 1992-08-28 1994-12-06 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Adaptive neuron model--an architecture for the rapid learning of nonlinear topological transformations
WO2012055593A1 (en) * 2010-10-29 2012-05-03 International Business Machines Corporation Neuromorphic and synaptronic spiking neural network with synaptic weights learned using simulation
WO2014060001A1 (en) * 2012-09-13 2014-04-24 FRENKEL, Christina Multitransmitter model of the neural network with an internal feedback

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HÄFLIGER, P.: "Adaptive WTA With an Analog VLSI Neuromorphic Learning Chip", IEEE TRANSACTIONS ON NEURAL NETWORKS, vol. 18, no. 2, March 2007 (2007-03-01), XP011184171, DOI: 10.1109/TNN.2006.884676 *
HUSSAIN, S. ET AL.: "Delay Learning Architectures for Memory and Classification", NEUROCOMPUTING, 28 February 2014 (2014-02-28) *
KOICKAL, T.J. ET AL.: "Analog VLSI Circuit Implementation of an Adaptive Neuromorphic Olfaction Chip", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: REGULAR PAPERS, vol. 54, no. 1, January 2007 (2007-01-01), XP011155723, DOI: 10.1109/TCSI.2006.888677 *

Similar Documents

Publication Publication Date Title
Kinget et al. A programmable analog cellular neural network CMOS chip for high speed image processing
Sánchez-López et al. A 16 Hz–160 kHz memristor emulator circuit
Khodabandehloo et al. Analog implementation of a novel resistive-type sigmoidal neuron
US10255551B2 (en) Mixed-signal circuitry for computing weighted sum computation
Bhardwaj et al. New grounded passive elements-based external multiplier-less memelement emulator to realize the floating meminductor and memristor
CN109074515A (en) Simulation electronic neural network
Cam Taskiran et al. Dual-output operational transconductance amplifier-based electronically controllable memristance simulator circuit
Suresha et al. A floating memristor emulator for analog and digital applications with experimental results
Gi et al. A ReRAM-based convolutional neural network accelerator using the analog layer normalization technique
Cheng et al. An improved memristive current mirror circuit for continuous adjustable current output
WO2016058055A1 (en) Trainable analogue block
JPH0318985A (en) information processing equipment
Kiraz et al. Impacts of feedback current value and learning rate on equilibrium propagation performance
Li et al. A 0.7 v low-power fully programmable gaussian function generator for brain-inspired gaussian correlation associative memory
Aggarwal et al. Electronically tunable CCCII and OTA-based fractional-order meminductor emulator and its application
Gozukucuk et al. A novel fully floating memristor emulator using OTA and passive elements
Khurana et al. A Hybrid CMOS-Memristor based Programmable Wien Bridge Oscillator
Hong et al. Analog Circuit Design Automation via Sequential RL Agents and Gm/ID Methodology
Kapur et al. Analog field programmable CMOS operational transconductance amplifier (OTA)
Shanaz et al. Reservoir computing using complex systems
Oliveira Weber et al. Topology variations of an amplifier-based mos analog neural network implementation and weights optimization
JPH04216160A (en) Neural network circuit
Mittal et al. Analog field programmable CMOS operational transresistance amplifier (OTRA)
Kapur et al. Analog field programmable CMOS current conveyor
CN115023709A (en) Neural network unit

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15850740

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15850740

Country of ref document: EP

Kind code of ref document: A1