KR100429976B1

KR100429976B1 - Behavior learning heighten method of robot

Info

Publication number: KR100429976B1
Application number: KR10-2002-0005963A
Authority: KR
Inventors: 장성준
Original assignee: 엘지전자 주식회사
Priority date: 2002-02-01
Filing date: 2002-02-01
Publication date: 2004-05-03
Anticipated expiration: 2022-02-01
Also published as: KR20030065902A

Abstract

The present invention relates to a behavior learning method for a robot in which the robot receives, over a network, the behavior patterns of other robots or rules specified by a user and learns them as its own behavior patterns. To this end, the invention comprises: a first process of receiving a behavior pattern of another robot over a network; a second process of temporarily raising the behavior selection probability of the behaviors associated with the behavior pattern received in the first process and expressing the corresponding behavior; a third process of identifying the user's response to the currently expressed behavior from the user's praise or scolding and determining whether that response is favorable; and a fourth process of increasing the behavior selection probability of the associated behaviors if the third process determines that the user's response is favorable.

Description

BEHAVIOR LEARNING HEIGHTEN METHOD OF ROBOT

The present invention relates to a behavior learning method for a robot, and more particularly to a behavior learning method in which a robot additionally learns the behaviors of other robots over a network.

In general, service robots and pet robots perceive user feedback and perform simple reinforcement learning on specific behaviors.

That is, the robot recognizes praise when the touch sensor on its head or neck is stroked or when a spoken word with a positive meaning is recognized, and recognizes scolding when the touch sensor is struck hard or when a spoken word with a negative meaning is recognized.

As described above, when the robot receives feedback corresponding to praise, it reinforces only the behavior it has just performed by raising that behavior's future expression rate. When it receives feedback corresponding to scolding, it weakens only the behavior it has just performed by lowering that behavior's future expression rate.

That is, the conventional behavior learning method reinforces only the behavior for which user feedback was received. Because this makes the robot simply obey what a person says, it produces a passive robot that performs only fragmentary learning.

In addition, because learning is driven solely by the user, the amount of learning information is insufficient, and the user therefore easily becomes bored with the robot.

The present invention was devised to solve the above problems. Its object is to provide a behavior learning method in which a robot receives, over a network, the behavior patterns of other robots or rules specified by a user and learns them as its own behavior patterns.

FIG. 1 is a block diagram showing the configuration of an apparatus to which the robot behavior learning method of the present invention is applied.

FIG. 2 is an operational flowchart of the robot behavior learning method of the present invention.

FIG. 3 is a schematic diagram showing the operation of the robot behavior learning method of FIG. 2.

***** Description of reference numerals for the main parts of the drawings *****

110: motor control unit 111: digital signal processing unit

112: PWM generator 113-1 to 113-N: motors

1 to N: potentiometers 120: motion control unit

121: microprocessor 122: flash memory

To achieve the above object, the present invention is characterized by comprising: a first process of receiving a behavior pattern of another robot over a network; a second process of temporarily raising the behavior selection probability of the behaviors associated with the behavior pattern received in the first process and expressing the corresponding behavior; a third process of identifying the user's response to the currently expressed behavior from the user's praise or scolding and determining whether that response is favorable; and a fourth process of increasing the behavior selection probability of the associated behaviors if the third process determines that the user's response is favorable.

Hereinafter, the operation and effects of the robot behavior learning method according to the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of an apparatus to which the robot behavior learning method of the present invention is applied. As shown, the apparatus comprises: a motor control unit (110) that receives behavior patterns of other robots or a user's learning rules via wireless LAN, transmits them to the motion control unit (120) over a physical link, and controls the driving of the motors (113-1 to 113-N) according to command signals from the motion control unit (120); and a motion control unit (120) that expresses the behavior patterns of other robots or the user's learning rules received from the motor control unit (110), establishes them as the robot's movement pattern according to external stimuli input through a touch panel and a microphone, and transmits the corresponding command signals to the motor control unit (110).

The motor control unit (110) comprises: a wireless LAN (not shown) for receiving behavior patterns of other robots or instructions from the user; a sensor (114) that senses external stimuli; an image capture unit (116) that captures external images through a CCD (115); a digital signal processing unit (111) that receives the stimuli from the sensor (114) and the images from the image capture unit (116), transmits them to the motion control unit (120) over the physical link, and digitally processes the command signals for specific motions sent from the motion control unit (120) to output the corresponding motor drive control signals; a PWM generator (112) that receives the motor drive control signals from the digital signal processing unit (111) and outputs a pulse-width-modulated signal for producing the motion speed corresponding to the motor drive control signal, together with a position signal indicating the position of the specific motion corresponding to the motor drive control signal; a plurality of motors (113-1 to 113-N), each of which drives its link according to the pulse-width-modulated signal and the position signal from the PWM generator (112); and potentiometers that determine the current joint positions through the links, detect errors in the commanded motion, and readjust the detected errors.

The motion control unit (120) comprises: an audio codec unit (124) that receives a voice signal through a microphone (125) and encodes it; a flash memory (122) in which the operating program and application programs are stored; a touch panel (123), integrated with an LCD panel, that receives external stimuli; a RAM (127) that stores the voice signals and external stimuli input through the audio codec unit (124) and the touch panel (123); and a microprocessor (121) that processes the external stimuli and voice signals input through the audio codec unit (124) and the touch panel (123), determines the robot's behavior accordingly, and transmits a command signal for that behavior to the motor control unit (110) over the physical link.

The physical link is an I/O bus, USB, or an RS-232C cable.

FIG. 2 is an operational flowchart of the robot behavior learning method. As shown, the method comprises: a first process of receiving a behavior pattern of another robot over a network; a second process of temporarily raising the behavior selection probability of the behaviors associated with the behavior pattern received in the first process and expressing the corresponding behavior; a third process of determining whether the user's response is favorable; and a fourth process of increasing the behavior selection probability of the associated behaviors if the third process determines that the user's response is favorable. The operation of the present invention is described below.

First, the robot has a behavior selection process (Behavior Selection) that selects a behavior according to the robot's internal state. The internal state consists of emotions (Emotion), such as anger, joy, and surprise, and motivations (Motivation), such as appetite and sexual desire.

The behavior selection process runs aperiodically on the microprocessor. The motion information of the selected behavior approximates the motion as a sequence of the robot's stationary postures at fixed time intervals, on the same principle as film or animation (Animation), which produces moving images by playing still images in succession.

However, whereas a film is expressed as image information, a stationary posture of the robot is expressed as the current command values of all of the robot's joints (Joint): for a revolute joint the command value is an angle, and for a prismatic joint it is a displacement.

Here, the set of command values of all joints representing one stationary posture of the robot is defined as a frame (Frame), and a time series (Time Series) of frames is defined as motion information.
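The frame and motion-information definitions above map naturally onto a small data structure. The following is a minimal sketch only: the class names `Frame` and `Motion`, the field names, and the fixed sampling interval are illustrative assumptions and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Frame:
    """One stationary posture: a command value for every joint.

    Assumed convention: degrees for revolute joints,
    displacement for prismatic joints.
    """
    joint_values: Tuple[float, ...]


@dataclass
class Motion:
    """Motion information: a time series of frames at a fixed interval."""
    interval_s: float
    frames: List[Frame]

    def duration(self) -> float:
        # Time from the first frame to the last frame.
        return self.interval_s * max(len(self.frames) - 1, 0)

    def pose_at(self, t: float) -> Frame:
        # Return the most recent frame at time t, clamped to the valid range.
        idx = int(t / self.interval_s)
        idx = max(0, min(idx, len(self.frames) - 1))
        return self.frames[idx]


# Hypothetical two-joint motion: one revolute joint and one prismatic joint.
wave = Motion(0.1, [Frame((0.0, 0.0)), Frame((10.0, 0.5)), Frame((20.0, 1.0))])
```

Playing the frames back in order at `interval_s` spacing corresponds to the film analogy in the description.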

To describe the operation of the present invention: first, the behaviors that the robot can perform with its current mechanical hardware are analyzed, and the connection ratio between behaviors is preset to build a Common-Sense Stereotype DB. Specific entries of the Common-Sense Stereotype DB can be modified or added in order to adjust the connection ratio between any behaviors.

The Common-Sense Stereotype DB is connected to a PC via wireless LAN or an RS-232C cable so that the user can replace its contents to adjust the connection ratios between behaviors.

For example, among the robot's behaviors, barking and digging can be grouped as dog-like behaviors, and saluting and standing at attention can be grouped as soldier-like behaviors.
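One way to picture the stereotype DB is as a mapping from a group to its member behaviors and their connection ratios. The sketch below is an assumption about layout only: the patent does not specify the storage format, the ratio range, or the function names `associated_behaviors` and `set_ratio`, and the numeric ratios are made up for illustration.

```python
# Hypothetical layout: group name -> {behavior: connection ratio in [0, 1]}.
stereotype_db = {
    "dog": {"bark": 0.8, "dig": 0.6},
    "soldier": {"salute": 0.9, "stand_at_attention": 0.7},
}


def associated_behaviors(db, behavior):
    """Return {other_behavior: ratio} for behaviors sharing a group with `behavior`."""
    related = {}
    for members in db.values():
        if behavior in members:
            for other, ratio in members.items():
                if other != behavior:
                    # Keep the strongest ratio if a behavior appears in several groups.
                    related[other] = max(ratio, related.get(other, 0.0))
    return related


def set_ratio(db, group, behavior, ratio):
    """Modify or add one entry, as a user might do from a connected PC."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    db.setdefault(group, {})[behavior] = ratio


related_to_bark = associated_behaviors(stereotype_db, "bark")
```

With this layout, looking up the behaviors related to "bark" yields the other dog-like behaviors, which is the lookup the later probability adjustment needs.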

The robot then receives the behavior pattern of another robot over the network, raises the behavior selection probability of the behaviors associated with that pattern, and expresses the corresponding behavior.

That is, the robot receives another robot's behavior pattern via wireless LAN and adjusts its own behavior selection probabilities with reference to it, thereby imitating the other robot's behavior pattern.

Next, the robot determines whether the user receives the other robot's behavior pattern favorably and thereby decides whether to adopt it as its own behavior pattern.

That is, when there is external feedback input from the user, the robot determines whether that input is praise or scolding and thereby judges the user's favorability.

Praise or scolding is judged from the user's input through the touch sensor: the two are distinguished either by the location of the touch sensor that received the feedback or by how long the user keeps the touch screen pressed.

As another way of judging praise or scolding, the user's voice input is compared with a word DB that stores words corresponding to praise and scolding.
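The three ways of telling praise from scolding (sensor location, press duration, word DB) can be sketched as three small classifiers. Everything concrete here is an assumption for illustration: the zone names, the 0.5 s threshold, the rule that a long press means praise, and the word lists are not given in the patent.

```python
from enum import Enum


class Feedback(Enum):
    PRAISE = 1
    SCOLD = -1
    NONE = 0


# Assumed example data, not taken from the patent.
PRAISE_ZONES = {"head", "neck"}
SCOLD_ZONES = {"back"}
PRAISE_WORDS = {"good", "nice", "great"}
SCOLD_WORDS = {"no", "bad", "stop"}


def classify_touch_zone(zone: str) -> Feedback:
    """Distinguish feedback by which touch sensor location was touched."""
    if zone in PRAISE_ZONES:
        return Feedback.PRAISE
    if zone in SCOLD_ZONES:
        return Feedback.SCOLD
    return Feedback.NONE


def classify_press_duration(seconds: float, threshold_s: float = 0.5) -> Feedback:
    """Distinguish feedback by press duration (assumed: long press = praise)."""
    return Feedback.PRAISE if seconds >= threshold_s else Feedback.SCOLD


def classify_speech(text: str) -> Feedback:
    """Compare recognized words against the praise/scold word DB."""
    words = set(text.lower().split())
    praised = bool(words & PRAISE_WORDS)
    scolded = bool(words & SCOLD_WORDS)
    if praised == scolded:
        # Neither kind of word, or both: treat as no usable feedback.
        return Feedback.NONE
    return Feedback.PRAISE if praised else Feedback.SCOLD
```

Returning `Feedback.NONE` for unrecognized input mirrors the case in which the stored external input is discarded and behavior selection simply continues.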

If the external feedback is praise or scolding from the user, the selection probability of the corresponding behavior is increased or decreased, respectively.

That is, if the external feedback is praise, the robot recognizes its current behavior from the stored time-series data of joint command values and increases the expression probability of the recognized behavior by a specific value, thereby learning the other robot's behavior pattern as its own. If the external feedback is scolding, the robot recognizes its current behavior from the stored time-series data of joint command values and decreases the expression probability of the recognized behavior by a specific value.

If rules set by the user are received via wireless LAN, the robot adjusts the behavior selection probabilities in the manner described above to show behavior that follows those rules, recognizes from praise or scolding whether the user is satisfied, and decides whether to adopt the behavior as its own behavior pattern.

If there are behaviors associated with the other robot's behavior pattern, the selection probability of those associated behaviors is also increased or decreased: for behaviors associated with a praised behavior the probability of occurrence is increased, and for behaviors associated with a scolded behavior it is decreased.

Described in more detail with reference to FIG. 3, the behavior selection process sets the expression probability of each motion according to the state of the Emotion Modeling unit and decides which motion to perform according to those probabilities. The Emotion Modeling unit is continuously updated by combining external inputs with the status of the robot's behavior execution.
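Choosing a motion according to expression probabilities can be illustrated with weighted random sampling. This is a sketch under assumptions: the patent does not say how the choice is drawn, the function name `select_behavior` is hypothetical, and the probability table is assumed to have already been produced from the Emotion Modeling state.

```python
import random


def select_behavior(probabilities, rng=random):
    """Pick one behavior, with chance proportional to its expression probability.

    `probabilities` maps behavior name -> non-negative weight; the weights
    need not sum to 1 because random.choices normalizes them.
    """
    behaviors = list(probabilities)
    weights = [probabilities[b] for b in behaviors]
    return rng.choices(behaviors, weights=weights, k=1)[0]


# Hypothetical table, e.g. as set for a "joyful" internal state.
current = {"wag_tail": 0.6, "bark": 0.3, "sleep": 0.1}
chosen = select_behavior(current, random.Random(0))
```

Passing a seeded `random.Random` instance makes a run repeatable, which is convenient when checking how a probability change affects what the robot does.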

When an external stimulus is input through the touch sensor or microphone while the robot is operating normally in this way, the stimulus is stored in RAM and the feedback processing unit determines whether the input is praise or scolding from a person. If it is determined to be neither, the external input stored in RAM is not used and the behavior selection process continues.

Conversely, if the input is determined to be praise or scolding, the robot uses the stored time-series data of joint command values to identify its current behavior, increases or decreases that behavior's expression probability by a specific value according to whether the input was praise or scolding, and then stores the updated probability.
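The patent does not say how the current behavior is identified from the stored joint time series. One simple possibility, shown here purely as an assumed stand-in, is nearest-template matching: compare the most recent frames against each known motion and pick the one with the smallest mean squared difference. The function name and the template library are hypothetical.

```python
def recognize_behavior(recent_frames, library):
    """Return the name of the library motion closest to the recent frames.

    recent_frames: list of tuples of joint command values (oldest first).
    library: {behavior name: list of frames in the same format}.
    Only the overlapping tail of each sequence is compared.
    """
    best_name, best_err = None, float("inf")
    for name, frames in library.items():
        n = min(len(frames), len(recent_frames))
        if n == 0:
            continue
        err = sum(
            (a - b) ** 2
            for f, g in zip(recent_frames[-n:], frames[-n:])
            for a, b in zip(f, g)
        ) / n
        if err < best_err:
            best_name, best_err = name, err
    return best_name


# Hypothetical two-joint templates (angles in degrees).
library = {
    "salute": [(0.0, 0.0), (90.0, 0.0)],
    "bark": [(0.0, 0.0), (0.0, 30.0)],
}
recognized = recognize_behavior([(0.0, 1.0), (88.0, 0.0)], library)
```

A real system would likely need time alignment and a rejection threshold; this sketch only shows where the joint time series enters the feedback step.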

The microprocessor then uses the Common-Sense Stereotype DB to determine whether there are behaviors associated with the behavior that received the praise or scolding. If there are, their future expression probabilities are increased or decreased accordingly and stored, and the robot returns to autonomous mode, in which the behavior selection process runs continuously.
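The update of the judged behavior and of its associated behaviors can be sketched as follows. The step size of 0.1 and the choice to scale the change for associated behaviors by their connection ratio are assumptions; the patent only states that the probabilities are raised or lowered "by a specific value".

```python
def clamp(p):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def apply_feedback(probs, behavior, sign, related, step=0.1):
    """Return updated expression probabilities after praise (+1) or scolding (-1).

    probs:   {behavior: expression probability}
    related: {associated behavior: connection ratio}, e.g. looked up in a
             stereotype DB; scaling by the ratio is an assumed design choice.
    """
    updated = dict(probs)
    updated[behavior] = clamp(updated.get(behavior, 0.0) + sign * step)
    for other, ratio in related.items():
        if other in updated:
            updated[other] = clamp(updated[other] + sign * step * ratio)
    return updated


probs = {"bark": 0.5, "dig": 0.3, "salute": 0.2}
praised = apply_feedback(probs, "bark", +1, {"dig": 0.5})
scolded = apply_feedback(probs, "bark", -1, {"dig": 0.5})
```

Because a new dictionary is returned, the caller decides when the updated table is stored, which matches the separate "store" step in the description.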

Meanwhile, when another robot's behavior pattern is received via wireless LAN, the expression probability of the corresponding behavior is temporarily adjusted.

For example, suppose that the robot's current behavior pattern is to always dodge to the left when it meets an obstacle, and that the behavior pattern newly received via wireless LAN is to look left and right on meeting an obstacle and dodge toward the side that is clear. The robot then arbitrarily raises the expression probability of the sequence of behaviors received via wireless LAN, namely looking left and right and then choosing the avoidance direction.

When the new approach is selected and executed as a result of this raised probability and the user praises it, the probability of looking left and right and then choosing a direction on encountering an obstacle is raised definitively.
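The obstacle-avoidance example separates a temporary boost, used only to try out the received pattern, from a lasting change made after the user reacts. The sketch below keeps the two as separate functions; the names, the boost of 0.5, and the step of 0.1 are illustrative assumptions.

```python
def boost_temporarily(base, pattern, boost=0.5):
    """Second process: return a trial table with the received behaviors boosted.

    The base table is left untouched, so the boost is not yet learned.
    """
    trial = dict(base)
    for behavior in pattern:
        trial[behavior] = min(1.0, trial.get(behavior, 0.0) + boost)
    return trial


def commit(base, pattern, reaction, step=0.1):
    """Fourth process: make a lasting change according to the user's reaction.

    reaction: +1 for praise, -1 for scolding, 0 for no usable feedback.
    """
    learned = dict(base)
    if reaction == 0:
        return learned
    for behavior in pattern:
        value = learned.get(behavior, 0.0) + reaction * step
        learned[behavior] = min(1.0, max(0.0, value))
    return learned


base = {"dodge_left": 0.8, "look_then_dodge": 0.1}
received = ["look_then_dodge"]
trial = boost_temporarily(base, received)   # used while trying the new pattern
after_praise = commit(base, received, +1)   # stored only after the user reacts
```

While `trial` is in effect the new behavior is likely to be expressed; only `commit` changes what the robot retains afterward.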

Likewise, if the user creates a rule on a PC, for example to rotate 360 degrees when there is an obstacle, find a direction that is clear, and advance in that direction, and transmits it to the robot, the robot arbitrarily raises the expression probability of the behavior for that rule in the same manner as above, gauges the user's reaction, and then decides the future probability of that behavior.

The specific embodiments and examples given in the detailed description of the present invention are intended solely to clarify its technical content. The invention should not be construed narrowly as limited to these specific examples, and various modifications may be made within the spirit of the invention and the scope of the claims set out below.

As described in detail above, the present invention has the effect that a robot easily acquires a variety of behaviors by receiving, over a network, the behavior patterns of other robots or rules specified by a user and learning them as its own behavior patterns.

Claims

A behavior learning method of a robot, comprising:

a first process of receiving a behavior pattern of another robot through a network;

a second process of temporarily increasing the behavior selection probability of the behaviors associated with the behavior pattern of the first process and expressing the corresponding behavior;

a third process of identifying the user's response to the currently expressed behavior from the user's praise or scolding and determining whether the user's response is favorable; and

a fourth process of increasing the behavior selection probability of the associated behaviors if the third process determines that the user's response is favorable.

The method of claim 1, wherein the first process further comprises receiving rules specified by the user over the network.

(Deleted)

The method of claim 1, wherein praise and scolding are determined from the user's input through a touch sensor.

The method of claim 4, wherein praise and scolding are distinguished by the location of the touch sensor that receives the feedback.

The method of claim 4, wherein praise and scolding are distinguished by how long the user keeps the touch screen pressed.

The method of claim 1, wherein praise or scolding is determined by comparing the user's voice input with a word DB that stores words corresponding to praise and scolding.

The method of claim 1, wherein the third process comprises recognizing the current robot behavior using the currently stored time-series data of joint command values and then increasing the expression probability of the recognized robot behavior by a specific value.

The method of claim 1, wherein the third process comprises recognizing the current robot behavior using the currently stored time-series data of joint command values and then decreasing the expression probability of the recognized robot behavior by a specific value.