Piecewise linear parametrization of policies: Towards interpretable deep reinforcement learning
Wabartha et al., 2023
- Document ID
- 17760499277202009526
- Author
- Wabartha M
- Pineau J
- Publication year
- 2023
- Publication venue
- The Twelfth International Conference on Learning Representations
Snippet
Learning inherently interpretable policies is a central challenge in the path to developing autonomous agents that humans can trust. We argue for the use of policies that are piecewise-linear. We carefully study to what extent they can retain the interpretable …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G06N5/046—Forward inferencing, production systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G06N5/043—Distributed expert systems, blackboards
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/004—Artificial life, i.e. computers simulating life
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Harb et al. | When waiting is not an option: Learning options with a deliberation cost | |
Wang et al. | Deep reinforcement learning: A survey | |
Eysenbach et al. | Robust predictable control | |
Le et al. | A deep hierarchical reinforcement learning algorithm in partially observable Markov decision processes | |
Fenjiro et al. | Deep reinforcement learning overview of the state of the art | |
Lang et al. | Exploration in relational domains for model-based reinforcement learning | |
Wickramasinghe et al. | Continual learning: A review of techniques, challenges, and future directions | |
Skowron et al. | Information systems in modeling interactive computations on granules | |
Mohan et al. | Structure in deep reinforcement learning: A survey and open problems | |
de Lope et al. | Response threshold models and stochastic learning automata for self-coordination of heterogeneous multi-task distribution in multi-robot systems | |
Wabartha et al. | Piecewise linear parametrization of policies: Towards interpretable deep reinforcement learning | |
Wang et al. | A language measure for performance evaluation of discrete-event supervisory control systems | |
Zhu et al. | Extracting decision tree from trained deep reinforcement learning in traffic signal control | |
Liu et al. | IWOA-RNN: An improved whale optimization algorithm with recurrent neural networks for traffic flow prediction | |
Skowron et al. | Toward interactive computations: A rough-granular approach | |
Klissarov et al. | Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning | |
Mousavi et al. | Automatic abstraction controller in reinforcement learning agent via automata | |
Herath | Evolution and Advancements from Neural Network to Deep Learning | |
Haas | Tutorial: Artificial Neural Networks for Discrete-Event Simulation | |
Rohrer | An implemented architecture for feature creation and general reinforcement learning | |
Bharti et al. | QL-SSA: An adaptive Q-learning based squirrel search algorithm for feature selection | |
Djurdjevic et al. | Deep belief network for modeling hierarchical reinforcement learning policies | |
Cohen et al. | Integrating distributed component-based systems through deep reinforcement learning | |
Koola et al. | How do we train a stone to think? A review of machine intelligence and its implications | |
Luttner | Training of Neural Networks with Uncertain Data: A Mixture of Experts Approach |