Self-Play
Plaat, 2020
- Document ID
- 1087011932165032891
- Author
- Plaat A
- Publication year
- 2020
- Publication venue
- Learning to Play: Reinforcement Learning and Games
Snippet
Chapter 7: Self-Play. This chapter is devoted to AlphaGo-style self-play. Self-play
is an intuitively appealing AI method that has long been used by AI researchers in various
forms, as we saw at the end of the previous chapter. The 2016 results showed, many years …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G06N5/046—Forward inferencing, production systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Hu et al. | A survey on large language model-based game agents | |
| Hilpisch | Artificial intelligence in finance | |
| US7873587B2 (en) | Method and system for creating a program to preform a desired task based on programs learned from other tasks | |
| Devezas et al. | Power law behavior and world system evolution: A millennial learning process | |
| Van Otterlo | The logic of adaptive behavior: Knowledge representation and algorithms for adaptive sequential decision making under uncertainty in first-order and relational domains | |
| Liu et al. | On efficient reinforcement learning for full-length game of starcraft ii | |
| Baier et al. | Emulating human play in a leading mobile card game | |
| Wu et al. | Learning to play Go using recursive neural networks | |
| Baum | Manifesto for an evolutionary economics of intelligence | |
| Cox et al. | Predicting the next response: Demonstrating the utility of integrating artificial intelligence-based reinforcement learning with behavior science | |
| Plaat | Self-Play | |
| Hu | Planning with a model: Alphazero | |
| Seify | Single-agent optimization with monte-carlo tree search and deep reinforcement learning | |
| Ruotsalainen | Comparing path-finding algorithms and machine learning model | |
| Iosti et al. | Synthesizing control for a system with black box environment, based on deep learning | |
| Plaat | Two-Agent Self-Play | |
| Araújo | Agentes Com Aprendizagem Automática Para Jogos de Computador | |
| Dobre | Low-resource learning in complex games | |
| West | Self-play deep learning for games: Maximising experiences | |
| Präntare | Simultaneous coalition formation and task assignment in a real-time strategy game | |
| Rizzo | Model-Free Multi-Agent Reinforcement Learning Approach in NeurIPS LuxAI S3 Competition | |
| Yannakakis et al. | AI Methods for Games | |
| Stooke | Advancements in Deep Reinforcement Learning: Algorithms and Implementations | |
| de Oliveira | A Modular Architecture for Model-Based Deep Reinforcement Learning | |
| Ring et al. | Replicating deepmind starcraft ii reinforcement learning benchmark with actor-critic methods |