KR20060081966A - Sound Image Localization System for Teleconferencing

Info

Publication number: KR20060081966A
Application number: KR1020050002655A
Authority: KR
Inventors: 남시욱; 류대하; 오상근
Original assignee: 엘지전자 주식회사
Priority date: 2005-01-11
Filing date: 2005-01-11
Publication date: 2006-07-14

Abstract

The present invention relates to a sound image localization system for efficient multi-party online teleconferencing and video teleconferencing.

The present invention comprises an input unit that receives voice signals from the teleconference participants, a processing unit that localizes the sound image of each participant's voice signal in the direction corresponding to that participant on a virtual conference room screen, and an output unit that sums the signals for output to stereo headphones or speakers. The invention is a technique for spreading the voice/sound signals of the conversations among participants widely across a virtual space, each in its own direction. Because the speakers' voice signals are not mixed together irregularly during the meeting, the user can listen to the whole conference with the speakers clearly distinguished and, even during a main conversation between the user and one participant, can simultaneously and naturally hear the remarks of the other participants and the conversations among them without the main conversation being disturbed.

Teleconference, video teleconference, sound image, sound image localization

Description

SOUND IMAGE LOCATION SYSTEM IN TELECONFERENCE SYSTEM

FIG. 1 is a diagram showing the configuration of the sound image localization system for teleconferencing according to the present invention.

FIG. 2 is a diagram showing an example of a virtual conference room screen to which the present invention is applied.

<Explanation of reference numerals for the main parts of the drawings>

10: input unit 20: sound image localization processing unit

30: output unit 40: system control unit

50: user interface unit 60: display unit

The present invention relates to a sound image localization system for efficient multi-party online teleconferencing and video teleconferencing. In particular, it relates to a technique for spreading the voice/sound signals of the conversations among participants widely across a virtual space, each in its own direction. Because these signals are not mixed together irregularly, the user can listen to the whole conference with the speakers clearly distinguished and, even during a main conversation between the user and one participant, can simultaneously and naturally hear the remarks of the other participants and the conversations among them without the main conversation being disturbed.

A teleconference is an information and communication service that lets people in different places hold a meeting over a telecommunication link as if they were in the same room. By removing obstacles such as wasted travel time and delayed decision making, it lets participants raise their productivity and use their time efficiently.

A teleconferencing system also makes it easy to exchange materials during a meeting, and it can set up a multi-party conference by calling several people without anyone having to travel.

With the recent development of communication and related service systems, teleconferencing has been evolving into multi-party video conferencing, in which participants hold a meeting while watching one another's video. For a multi-party video conferencing system to run smoothly, information about the participants, video information, and voice information must all be transmitted smoothly.

Of these, voice information changes far more than video information or participant information. In a conference attended by three or more people including the user, the voice information produced moment by moment by the participants may not reach the user in a form in which the speakers are clearly distinguished.

That is, in a conference attended by three or more people, the voice signal produced by each participant is, from the user's point of view, no more than two-dimensional communication, so it is very difficult to tell directly which speaker has spoken. For example, suppose that participants B and C attend a multi-party teleconference together with participant A, the user. Because the video conference proceeds over communication and A/V interface means, it differs markedly from participants A, B, and C meeting face to face in a real three-dimensional space, exchanging opinions, speaking, and listening to each speaker's voice. In a meeting in a real space, participant A clearly distinguishes and recognizes the positions of participants B and C in the physical space, so when B speaks, A can recognize that B is speaking from the physical positional relationship between A and the speaker. In a teleconference, which takes place in a virtual space, this physical recognition is absent, so it is almost impossible to perceive which participant has spoken, either from a three-dimensional viewpoint or through the two-dimensional virtual impression.

An object of the present invention is to provide a sound image localization system for teleconferencing that localizes the sound image the user perceives for each participant at a different position, so that the user hears the whole conference with the speakers (participants) clearly distinguished.

Another object of the present invention is to provide a sound image localization system for teleconferencing that localizes the sound image of each participant's voice and sound, as perceived by the user of the system through the ears, at a different position in a virtual space for each participant. This prevents the participants' remarks from being heard irregularly jumbled together and lets the user, during a main conversation with one participant, simultaneously and naturally hear the remarks of the other participants and the conversations among them without the main conversation being disturbed.

To achieve the above objects, the sound image localization system for teleconferencing of the present invention comprises: input signal processing means for receiving and processing the voice/sound signals of the teleconference participants; sound image localization processing means for placing the input voice/sound signal of each participant at a different position in a virtual space; and output signal processing means for outputting the localized voice/sound signal of each participant to the user.

Further, in the sound image localization system for teleconferencing of the present invention, the voice/sound signals generated by the participants are localized so that, relative to the user, each participant has a different direction.

Further, in the sound image localization system for teleconferencing of the present invention, the sound image of the voice/sound signal generated by each participant is placed so as to correspond to the position and direction of that participant relative to the user on the virtual conference room screen on which the participants are arranged.

Further, in the sound image localization system for teleconferencing of the present invention, the sound image of the voice/sound signal generated by each participant corresponds to the position and direction of that participant relative to the user on the virtual conference room screen on which the participants are arranged, and when a participant's position on the virtual conference room screen changes, the sound image localization for that participant is adjusted accordingly.

Further, in the sound image localization system for teleconferencing of the present invention, when the user selects the image of one participant on the virtual conference room screen on which the participants are arranged, a 1:1 main conversation or private conversation with the selected participant is set up. In this case the sound image of the selected participant's voice/sound is localized at the center of the user's head, and the sound images of the voice/sound from the remaining participants are localized in other directions.

Further, in the sound image localization system for teleconferencing of the present invention, the sound image localization processing by the sound image localization processing means is performed by controlling the amount of signal delay and/or the level difference separately for the left and right channels.

Further, in the sound image localization system for teleconferencing of the present invention, the sound image localization processing by the sound image localization processing means is based on convolving the left and right channel signals with the spatial transfer function corresponding to the direction in which the sound image is to be localized.

Hereinafter, an embodiment of the sound image localization system for teleconferencing of the present invention, configured as described above, is described with reference to the accompanying drawings.

First, the sound image localization proposed in the present invention is examined.

When sound generated by a sound source at an arbitrary position in a space propagates through the space and is perceived by a person, the person hears the sound and can recognize, among other things, the direction it came from. What the listener perceives of the sound generated by the source is called the sound image, which includes the propagation direction and distance of the sound signal.

A person can recognize direction from sound because of the level difference between the two ears in the sound signal arriving at them and the difference in the time at which the same signal reaches each ear.

The sound image localization system of the present invention deliberately imposes a level difference and an arrival time difference on the left and right channel components of the sound from a source, so that the sound image appears at the desired position (direction).

In general, the sound image localization system of the present invention is implemented by localizing a sound image either by applying a level difference or arrival time difference predetermined for each direction, or by applying (convolving) to the sound signal a spatial transfer function measured in advance for various directions.
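The first approach, a per-direction level difference and arrival time difference, can be sketched roughly as follows. This is a minimal illustration rather than the patent's implementation: the function name `localize`, the sine mapping from azimuth to delay and attenuation, and the limits of 0.7 ms and 6 dB are all assumptions chosen for the example.

```python
import math

def localize(mono, azimuth_deg, sample_rate=8000,
             max_itd_s=0.0007, max_ild_db=6.0):
    """Return (left, right) sample lists for a mono signal.

    azimuth_deg: -90 is fully left, 0 is straight ahead, +90 is fully right.
    max_itd_s and max_ild_db are illustrative assumptions, not patent values.
    """
    s = math.sin(math.radians(azimuth_deg))          # -1.0 .. 1.0
    # The ear farther from the source hears the signal later and quieter.
    delay = round(abs(s) * max_itd_s * sample_rate)  # delay in samples
    far_gain = 10 ** (-abs(s) * max_ild_db / 20)     # dB to linear gain
    near = list(mono) + [0.0] * delay                # pad to equal length
    far = [0.0] * delay + [x * far_gain for x in mono]
    if s < 0:            # source on the left: the left ear is the near ear
        return near, far
    return far, near     # source on the right (or centered)

# A source fully to the left: the right channel is delayed and attenuated.
left, right = localize([1.0, 0.5], -90, sample_rate=10000)
```

With an azimuth of 0 the two channels come out identical, which corresponds to a centered sound image.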

Accordingly, the sound image localization system for teleconferencing of the present invention localizes the sound images of the voices generated by the participants (here and below, "voice" includes sound signals) so that each participant has a different direction. That is, the sound images of the participants' voices, as perceived by the user of the teleconferencing system through the ears, are localized at different positions in a virtual space for each participant, so that the sound images of the voices in the conversations among participants do not mix. The user can then listen to the whole conference with the speakers distinguished.

The direction of the sound image of each participant's voice is made to correspond to the position and direction of that participant's image on the virtual conference room screen.

In addition, when the user selects the image of one participant on the virtual conference room screen, a 1:1 main conversation or private conversation can take place. During the 1:1 main conversation, the sound image of that participant's voice is localized at the center of the head, and the voices of the other participants are localized at predetermined positions elsewhere in the virtual space so that they have other directions. This maximizes the protection of the 1:1 main conversation environment while letting the user simultaneously and naturally hear the other participants without the main conversation being disturbed.

FIG. 1 shows the configuration of an embodiment of the sound image localization system for teleconferencing according to the present invention. The system is the part of a teleconferencing system that handles voice information transmission. It broadly consists of an input unit (10) that receives voice signals from the participants, a sound image localization processing unit (20) that localizes the sound image of each participant's voice signal in the direction corresponding to that participant on the virtual conference room screen, and an output unit (30) that sums the signals for output to stereo headphones or speakers. Together with a system control unit (40) for system control, a user interface unit (50), and a display unit (60), these form the teleconferencing system.

The input unit (10) includes A/D converters (11, 12, 13) for converting the voice/sound signals input through microphones from the respective participants into digital signals. The sound image localization processing unit (20) includes localization processors (211, 212), (221, 222), (231, 232) that add a signal delay or level difference to each of the left and right channels of the digital voice/sound signal input from each participant. The output unit (30) includes adders (31, 32) that sum the localized voice/sound signals of the participants, and D/A converters (33, 34) that convert the summed localized voice/sound signals into analog signals for output to speakers or headphones.

The input unit (10) converts the analog voice signal input through the microphone from each participant into a digital signal through the A/D converters (11, 12, 13).

The sound image localization processing unit (20) performs, for each participant, different localization processing on the left and right channels according to the magnitude of the direction in which the sound image is to be localized. The direction of localization differs for each participant and corresponds to the positions of the user's and the participants' images on the virtual conference room screen.

This is implemented by the sound image localization processors (211, 212), (221, 222), (231, 232) applying different delay and level adjustments for each participant.

For example, for the voice/sound signal of participant 1, the left channel localization processor (L localization) (211) adjusts the amount of signal delay and the signal level of the left channel signal, and the right channel localization processor (R localization) (212) likewise adjusts the amount of signal delay and the signal level of the right channel signal, so that the sound image of participant 1 is localized distinctly from those of the other participants.

That is, the localization processing in the present invention generally either applies different amounts of signal delay and level difference to the left and right channels, or convolves the left and right channel signals with the spatial transfer function (HRTF: Head Related Transfer Function) corresponding to the direction of localization.
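The convolution-based alternative can be sketched as below, assuming the spatial transfer function for the chosen direction is available as a pair of time-domain impulse responses, one per ear. The function names and the toy impulse responses are hypothetical; a real system would use measured responses, which the patent does not list.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

def localize_with_hrir(mono, hrir_left, hrir_right):
    """Filter one mono signal with the left-ear and right-ear responses."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy responses (assumed, not measured): the right ear receives the sound
# one sample later and at half the level, as for a source on the left.
left, right = localize_with_hrir([1.0, 2.0], [1.0], [0.0, 0.5])
```

Direct convolution costs on the order of the product of the two lengths; for long measured responses an FFT-based method would normally be preferred.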

The voice/sound signals localized per participant in this way are finally summed per left and right channel in the output unit (30), converted into analog voice signals, and sent to an output device such as headphones or speakers. That is, the first adder (31) sums the left channel signals of the localized participants and supplies the sum to the first D/A converter (33), so that the localized left channel voice/sound signal is converted into an analog signal and output to the speakers or headphones. The second adder (32) sums the right channel signals of the localized participants and supplies the sum to the second D/A converter (34), so that the localized right channel voice/sound signal is converted into an analog signal and output to the speakers or headphones.
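The role of the two adders can be illustrated with a short sketch that sums the per-participant left signals into one left channel and the right signals into one right channel. The names `mix` and `mix_stereo` are hypothetical, and zero-padding shorter signals is an assumption made so the example runs on inputs of unequal length.

```python
def mix(channels):
    """Sum several sample lists sample by sample (shorter ones end early)."""
    length = max((len(c) for c in channels), default=0)
    return [sum(c[i] for c in channels if i < len(c)) for i in range(length)]

def mix_stereo(localized):
    """localized: list of (left, right) pairs, one pair per participant."""
    lefts = [left for left, _ in localized]
    rights = [right for _, right in localized]
    # One sum per channel, mirroring the first and second adders.
    return mix(lefts), mix(rights)

out_left, out_right = mix_stereo([
    ([1.0, 0.0], [0.5, 0.5]),        # participant 1, already localized
    ([0.0, 1.0, 1.0], [0.25]),       # participant 2, already localized
])
```

A practical mixer would also guard against clipping when many participants speak at once; that step is omitted here.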

As described above, in the teleconferencing system according to the present invention, localization is performed so that the participants' positions on the virtual conference room screen and their sound images correspond to each other. The system control unit (40) therefore holds the position of each participant in the virtual space on the virtual conference room screen (see FIG. 2) displayed on the display unit (60), and controls the localization of that participant's voice/sound signal in the sound image localization processing unit (20).

For example, viewed from the user in FIG. 2, the virtual space of the conference room screen places participant 1 to the user's left, participant 2 in front of the user, and participant 3 to the user's right. In accordance with these positions in the virtual space, the sound image localization processing unit (20) gives the voice/sound signal of participant 1 the delay and left/right channel level difference that make it seem to come from the user's left, gives that of participant 2 the delay and level difference that make it seem to come from in front of the user, and gives that of participant 3 the delay and level difference that make it seem to come from the user's right.
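One way to derive a direction from a participant's on-screen position is a linear mapping from horizontal screen coordinate to azimuth, sketched below. The patent does not specify the mapping; the function name, the pixel coordinates, and the 90-degree range are assumptions for illustration only.

```python
def screen_to_azimuth(x, screen_width, max_azimuth_deg=90.0):
    """Map a horizontal screen position to an azimuth in degrees.

    The left edge maps to -max_azimuth_deg, the center to 0 (in front of
    the user), and the right edge to +max_azimuth_deg.
    """
    center = screen_width / 2
    return (x - center) / center * max_azimuth_deg

# Hypothetical layout on an 800-pixel-wide screen, echoing FIG. 2:
# participant 1 on the left, participant 2 in front, participant 3 on the right.
azimuths = {
    "participant 1": screen_to_azimuth(0, 800),
    "participant 2": screen_to_azimuth(400, 800),
    "participant 3": screen_to_azimuth(800, 800),
}
```

When the user moves a participant's image, calling the mapping again with the new coordinate yields the updated direction for that participant.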

In addition, when the user changes a participant's position using the user interface unit (50), the system control unit (40) changes the position at which that participant's image is displayed on the display unit (60), and the sound image localization processing unit (20) localizes that participant's sound image at the changed position based on the associated position information.

When the user selects the image of a particular participant using the user interface unit (50), a 1:1 main conversation (or private conversation) with that participant is made possible: the sound image of the selected participant's voice/sound is localized at the center of the user's head, and the sound images of the voice/sound from the remaining participants are localized in other directions, which protects the 1:1 main conversation environment.
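The direction assignment for a 1:1 main conversation might look like the sketch below. It assumes directions are expressed as azimuths in degrees, that an azimuth of 0 means identical left and right signals (which over headphones tends to be heard at the center of the head), and that the remaining participants alternate between the left and right sides. None of these details come from the patent, and `assign_directions` is a hypothetical name.

```python
def assign_directions(participants, selected=None, spread_deg=90.0):
    """Return {participant: azimuth in degrees}.

    Without a selection, participants are spread evenly from -spread_deg
    to +spread_deg. With a selection, the selected participant gets 0.0
    and all others are pushed off-center, alternating left and right.
    """
    others = [p for p in participants if p != selected]
    n = len(others)
    directions = {}
    if selected is None:
        for i, p in enumerate(others):
            directions[p] = (0.0 if n == 1
                             else -spread_deg + 2 * spread_deg * i / (n - 1))
        return directions
    directions[selected] = 0.0
    half = (n + 1) // 2                 # number of slots per side
    for i, p in enumerate(others):
        side = -1 if i % 2 == 0 else 1  # even index left, odd index right
        directions[p] = side * spread_deg * (i // 2 + 1) / half
    return directions

normal = assign_directions(["A", "B", "C"])
private = assign_directions(["A", "B", "C"], selected="B")
```

In the private case no other participant is ever placed at 0.0, so the selected voice stays alone at the center.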

By localizing the sound images of the voice signals from the individual participants in a teleconferencing system, the present invention prevents the meaning of each remark from being lost when voice signals that may occur simultaneously are heard mixed together, and gives the user a greater sense of being in a real conference room.

Claims

A sound image localization system for teleconferencing, comprising: input signal processing means for receiving and processing the voice/sound signals of teleconference participants; sound image localization processing means for placing the input voice/sound signal of each participant at a different position in a virtual space; and output signal processing means for outputting the localized voice/sound signal of each participant to a user.

The system of claim 1, wherein the voice/sound signals generated by the participants are localized so that, relative to the user, each participant has a different direction.

The system of claim 1, wherein the sound image of the voice/sound signal generated by each participant is placed so as to correspond to the position and direction of that participant relative to the user on a virtual conference room screen on which the participants are arranged.

The system of claim 1, wherein the sound image of the voice/sound signal generated by each participant corresponds to the position and direction of that participant relative to the user on a virtual conference room screen on which the participants are arranged, and when a participant's position on the virtual conference room screen changes, the sound image localization for that participant is adjusted accordingly.

The system of claim 1, wherein, when the user selects the image of one participant on a virtual conference room screen on which the participants are arranged, a 1:1 main conversation or private conversation with the selected participant is set up, the sound image of the selected participant's voice/sound is localized at the center of the user's head, and the sound images of the voice/sound from the remaining participants are localized in other directions.

The system of claim 1, wherein the sound image localization processing by the sound image localization processing means is performed by controlling the amount of signal delay and/or the level difference separately for the left and right channels.

The system of claim 1, wherein the sound image localization processing by the sound image localization processing means is based on convolving the left and right channel signals with the spatial transfer function corresponding to the direction in which the sound image is to be localized.