WO2019237429A1 - Method, apparatus and system for assisting communication, and augmented reality glasses - Google Patents
- Publication number
- WO2019237429A1 (PCT/CN2018/092815)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound source
- voice
- text
- speech
- preset
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B21/00—Teaching, or communicating with, the blind, deaf or mute
- G09B21/009—Teaching or communicating with deaf persons
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/017—Head mounted
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/0101—Head-up displays characterised by optical features
- G02B2027/014—Head-up displays characterised by optical features comprising information/image processing systems
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/0101—Head-up displays characterised by optical features
- G02B2027/0141—Head-up displays characterised by optical features characterised by the informative content of the display
Definitions
- The present invention relates to the field of augmented reality technology, and in particular to a method, device, and system for assisting communication, and to augmented reality glasses.
- Augmented reality (AR) is a technology that calculates the position and angle of an image in real time and superimposes corresponding images, video, and 3D models on that image, thereby fusing the virtual world with the real world.
- An AR client can perform real-time image recognition on the user's offline environment using images stored in its local image recognition library, and can display the corresponding display data as an enhancement at the position of each identified offline target in the real scene, according to a pre-configured display effect.
- Augmented reality technology is widely used, but so far it has offered little help to the hearing impaired.
- An object of the present invention is to provide a method, a device, and a system for assisting communication, and augmented reality glasses, which can enable a hearing impaired person to accurately grasp the content of a voice emitted by a sound source and the position of the sound source.
- One aspect of the present invention provides a method for assisting a hearing impaired person to communicate, the method comprising: receiving the voice of at least one sound source; determining, based on the received voice of each sound source of the at least one sound source, the position of each sound source; recognizing the voice of each sound source to convert it into text; and displaying the position of each sound source and the text converted from its voice.
- the method further includes: receiving text; converting the received text into speech; and playing the converted speech.
- The orientation of each sound source of the at least one sound source and the text converted from its voice are displayed in either of the following forms: using a preset foreground color and a preset background color; or alternately swapping the preset foreground color and the preset background color so that the orientation and text of different sound sources are distinguished, wherein the preset foreground color and the preset background color are different colors.
- the preset foreground color is white and the preset background color is green; or the preset foreground color is green and the preset background color is white.
- The method further comprises: determining location information of the hearing impaired person; and sending the location information to a mobile terminal and/or client, so that the mobile terminal and/or client obtains the location information in real time.
- Before sending the location information to a mobile terminal and/or client, the method further includes: receiving a setting of a contact, wherein the mobile terminal and/or client is the mobile terminal and/or client corresponding to the selected contact.
- The device includes: a voice receiving module for receiving the voice of at least one sound source; a determining module for determining, based on the received voice of each sound source of the at least one sound source, the position of each sound source; a speech recognition module for recognizing the voice of each sound source to convert it into text; and a display module for displaying the position of each sound source and the text converted from its voice.
- the device further includes: a text receiving module for receiving text; a text conversion module for converting the received text into speech; and a voice playback module for playing the converted speech.
- the display module is a near-eye display.
- the near-eye display is a see-through near-eye display.
- The orientation of each sound source of the at least one sound source and the text converted from its voice are displayed in either of the following forms: using a preset foreground color and a preset background color; or alternately swapping the preset foreground color and the preset background color so that the orientation and text of different sound sources are distinguished, wherein the preset foreground color and the preset background color are different colors.
- the preset foreground color is white and the preset background color is green; or the preset foreground color is green and the preset background color is white.
- The device further includes: a positioning module for determining position information of the hearing impaired person; and a communication module for sending the position information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the position information in real time.
- The device further includes: a contact setting module configured to receive the setting of a contact before the communication module sends the position information to the mobile terminal and/or the client, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
- Another aspect of the present invention provides augmented reality glasses, which include the above-mentioned device.
- Another aspect of the present invention also provides a system for assisting the hearing impaired to communicate, the system including: the above-mentioned device; and a client.
- Another aspect of the present invention also provides a machine-readable storage medium, where the machine-readable storage medium stores instructions, and the instructions are used to cause a machine to execute the method described above.
- Displaying the text converted from the voice of each sound source of the at least one sound source enables a hearing impaired person to understand what each sound source is saying; displaying the position of each sound source provides a simple and intuitive way to show the hearing impaired person where each voice originates.
- While watching the "subtitles" to understand the speech, the hearing-impaired person also obtains positional awareness similar to that of a hearing person. The hearing-impaired person can therefore accurately grasp both the content of the voice emitted by each sound source and the position of that sound source, which makes it easier to communicate with others.
- accurately grasping the content of the sound emitted by each sound source and the position of the sound source is very helpful for the communication of hearing impaired people with others.
- FIG. 1 is a flowchart of a method for assisting a hearing-impaired person to communicate according to an embodiment of the present invention
- FIG. 2 is an exemplary diagram using arrows to indicate directions according to another embodiment of the present invention.
- FIG. 3 is an exemplary diagram of an orientation provided by another embodiment of the present invention.
- FIG. 4 is an exemplary diagram showing the position of a sound source and the text converted by speech provided by another embodiment of the present invention.
- FIG. 5 is an exemplary diagram showing the positions of multiple sound sources and text converted by voice according to another embodiment of the present invention.
- FIG. 6 is a flowchart of a method for assisting a hearing impaired person to communicate according to another embodiment of the present invention.
- FIG. 7 is a structural block diagram of a device for assisting a hearing-impaired person to communicate according to another embodiment of the present invention.
- FIG. 1 is a flowchart of a method for assisting a hearing impaired person to communicate according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps.
- In step S10, the voice of at least one sound source is received.
- In step S11, the position of each sound source in the at least one sound source is determined based on the received voice of each sound source.
- The position of a sound source may be determined from the times at which the voice emitted by that sound source is received.
- The voice receiving module for receiving the voice of at least one sound source includes a plurality of voice acquisition modules disposed at different positions, so the voice from the same sound source arrives at the different voice acquisition modules at different times.
- The position of the sound source is determined from the differences between the times at which the voice arrives at the voice acquisition modules.
- The voice acquisition module may be a microphone, and the voice receiving module may be a microphone array.
- the microphone array may include 2, 4, 6, 7, or 8 microphones.
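The time-difference principle described above can be illustrated with a minimal sketch for the two-microphone case. The source does not specify an algorithm, so everything here is an assumption made for illustration: the cross-correlation search, the far-field geometry, the speed of sound constant, and the function names `estimate_delay_samples` and `bearing_from_delay`.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 degrees C


def estimate_delay_samples(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means the sound reached the right microphone later
    than the left one. This is a brute-force cross-correlation search.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_from_delay(delay_s, mic_spacing_m):
    """Convert an arrival-time difference into a bearing in degrees.

    Assumes a distant source (far-field). 0 means straight ahead of the
    microphone pair; a positive angle points toward the left microphone,
    which heard the sound first.
    """
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    sample_rate_hz = 48_000  # hypothetical sampling rate
    pulse = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    delayed = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]  # same pulse, 3 samples later
    lag = estimate_delay_samples(pulse, delayed, max_lag=5)
    print(lag, bearing_from_delay(lag / sample_rate_hz, mic_spacing_m=0.15))
```

A two-microphone pair only resolves an angle relative to its own axis; arrays with more microphones, as listed above, are what would allow front and back or distance to be distinguished.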
- The reference point of the orientation may be set according to the actual situation. For example, it may be the position of the voice receiving module: specifically, the position of any one of the voice acquisition modules, or a position midway between them.
- When the voice receiving module is worn by the hearing-impaired person or is close to that person, using the voice receiving module as the reference point in effect uses the hearing-impaired person as the reference point. In this way, the hearing-impaired person can tell from the orientation where the sound source is relative to himself or herself.
- the orientation may include a direction and / or a distance.
- an arrow may be used to indicate the direction.
- The arrow is located in an area delimited by a circle, and the starting point of the arrow is the center of the circle, which corresponds to the position of the hearing impaired person.
- The arrow deviates by an angle from the vertical axis passing through the center of the circle, as shown in FIG. 2, where the vertical dotted line is that vertical axis.
- The horizontal axis of the circle, shown as the horizontal dashed line in the figure, is used as a reference: when the arrow is above the horizontal axis, the sound source is in front of the hearing impaired person; when the arrow is below the horizontal axis, the sound source is behind the hearing impaired person. In the example shown in FIG. 2, the sound source indicated by the arrow is in front of the hearing impaired person.
- Indicating the direction of the sound source with an arrow can also be read as indicating a clock direction. The circle represents the dial, and the half of the vertical axis above the horizontal axis indicates the 12 o'clock direction. The clock direction of the sound source is determined from the angle by which the arrow deviates from 12 o'clock.
- In the example, the sound source indicated by the arrow is at about 10 o'clock.
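The clock reading of the arrow can be sketched as a small conversion. This is an illustrative sketch only: the convention that the angle is measured clockwise from the 12 o'clock axis, and the function names `clock_direction` and `is_in_front`, are assumptions rather than details given in the source.

```python
def clock_direction(angle_deg):
    """Map a bearing to the nearest clock hour (1 to 12).

    `angle_deg` is assumed to be measured clockwise from the 12 o'clock
    axis, i.e. from straight ahead of the wearer.
    """
    hour = round((angle_deg % 360) / 30) % 12  # 30 degrees per clock hour
    return 12 if hour == 0 else hour


def is_in_front(angle_deg):
    """True when the arrow lies above the horizontal axis of the circle."""
    normalized = angle_deg % 360
    return normalized < 90 or normalized > 270


if __name__ == "__main__":
    # An arrow tilted 60 degrees to the left of straight ahead.
    print(clock_direction(-60), is_in_front(-60))
```

Under this convention, an arrow tilted 60 degrees to the left of the vertical axis reads as 10 o'clock and lies in front of the wearer, which matches the example above.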
- The orientation can be expressed as in the example shown in FIG. 3.
- The position at which the distance is displayed can be set according to the actual situation and is not limited here.
- When the orientation includes only the distance, only the distance may be displayed.
- text may also be used to describe the orientation. For example, taking the orientation shown in FIG. 3 as an example, the text "direction is ten o'clock and distance is 50 cm" may be displayed.
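A textual description of this kind could be produced as in the following sketch. The function name `describe_orientation` and its parameters are hypothetical; the sketch only reflects the statements above that an orientation may include a direction and/or a distance.

```python
HOUR_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five", 6: "six",
    7: "seven", 8: "eight", 9: "nine", 10: "ten", 11: "eleven", 12: "twelve",
}


def describe_orientation(clock_hour=None, distance_cm=None):
    """Build a caption from whichever parts of the orientation are known."""
    parts = []
    if clock_hour is not None:
        parts.append(f"direction is {HOUR_WORDS[clock_hour]} o'clock")
    if distance_cm is not None:
        parts.append(f"distance is {distance_cm} cm")
    return " and ".join(parts)


if __name__ == "__main__":
    print(describe_orientation(clock_hour=10, distance_cm=50))
    print(describe_orientation(distance_cm=50))  # distance-only orientation
```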
- In step S12, the voice of each sound source in the at least one sound source is recognized and converted into text.
- Speech recognition technology is used to convert the speech into text.
- In step S13, the position of each of the at least one sound source and the text converted from its voice are displayed.
- An example of displaying the position of a sound source and the text converted from its voice is shown in FIG. 4.
- When the text converted from the voice of a sound source does not fit on one line, it can wrap onto a new line or be scrolled.
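Both options, wrapping and scrolling, can be sketched with the Python standard library. The helper names `wrap_caption` and `scroll_frames` and the character-based width are assumptions for illustration; a real display would measure rendered width rather than character count.

```python
import textwrap


def wrap_caption(text, width):
    """Break a caption into lines no longer than `width` characters."""
    return textwrap.wrap(text, width=width)


def scroll_frames(text, width):
    """Yield successive fixed-width windows for a single scrolling line."""
    if len(text) <= width:
        yield text
        return
    for start in range(len(text) - width + 1):
        yield text[start:start + width]


if __name__ == "__main__":
    for line in wrap_caption("please pass me the blue folder", width=12):
        print(line)
    for frame in scroll_frames("abcdef", width=4):
        print(frame)
```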
- FIG. 4 shows the positions of the area displaying the orientation and the area displaying the converted text only by way of example. The positions of the two display areas can be selected according to the actual situation and are not limited.
- When there is a single sound source, the example shown in FIG. 4 may be used for display.
- When there are multiple sound sources, they may be displayed one above the other in sequence, as shown in FIG. 5.
- The sound sources may also be displayed from left to right according to their positions, or in other arrangements, which is not limited.
- the preset foreground color and the preset background color are used for display, and the preset foreground color and the preset background color are different colors.
- the preset foreground color is white and the preset background color is black to display white characters on a black background; or the preset foreground color is black and the preset background color is white to display black characters on a white background.
- the preset foreground color is white, the preset background color is green, and the white text on a green background is displayed; or the preset foreground color is green, the preset background color is white, and the green text on a white background is displayed.
- The orientation and text may also be displayed by alternately swapping the preset foreground color and the preset background color to distinguish different sound sources, the preset foreground color and the preset background color being different colors.
- Following the order in which voices are received, when adjacent voices come from different sound sources, the preset foreground color and the preset background color are swapped; when adjacent voices come from the same sound source, the colors do not change.
- The preset foreground color and the preset background color can be chosen according to actual conditions.
- For example, the preset foreground color is white and the preset background color is black, so that white text is displayed on a black background; or the preset foreground color is black and the preset background color is white, so that black text is displayed on a white background.
- As another example, the preset foreground color is white and the preset background color is green, so that white text is displayed on a green background; or the preset foreground color is green and the preset background color is white, so that green text is displayed on a white background.
- The following example uses white as the preset foreground color and green as the preset background color, and alternately swaps them to display the orientation and text corresponding to different sound sources.
- A first voice corresponds to a first sound source. The text converted from the first voice and the orientation of the first sound source are displayed in white on a green background.
- If the next voice (the second voice) comes from a different sound source (the second sound source), the text converted from the second voice and the orientation of the second sound source are displayed in green on a white background.
- If the next voice (the third voice) also comes from the second sound source, green text on a white background is still used.
- If the next voice (the fourth voice) comes from a different sound source (the third sound source, which may be the first sound source or any other sound source, as long as it is not the second sound source), the text converted from the fourth voice and the orientation of the third sound source are displayed in white on a green background.
- The cycle continues until the information corresponding to all received voices has been displayed, where the information corresponding to a voice includes the orientation of its sound source and the text converted from the voice.
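The alternation rule in this example amounts to swapping the two colors whenever the speaker changes. A minimal sketch follows; the function name `assign_colors` and the tuple layout are illustrative assumptions, not part of the source.

```python
def assign_colors(utterances, first=("white", "green")):
    """Attach (foreground, background) colors to utterances in arrival order.

    `utterances` is a list of (source_id, text) pairs. The colors are swapped
    only when the current voice comes from a different sound source than the
    previous one; consecutive voices from the same source keep their colors.
    """
    foreground, background = first
    previous_source = None
    colored = []
    for source_id, text in utterances:
        if previous_source is not None and source_id != previous_source:
            foreground, background = background, foreground
        colored.append((source_id, text, foreground, background))
        previous_source = source_id
    return colored


if __name__ == "__main__":
    conversation = [("A", "Hello"), ("B", "Hi"), ("B", "How are you?"), ("A", "Fine")]
    for row in assign_colors(conversation):
        print(row)
```

With the four voices of the worked example, the sequence is white on green, green on white, green on white, then white on green again.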
- The operating burden on the user is minimal: he or she can "hear" information beyond his or her hearing ability without having to operate the technical system at all.
- the method for assisting the hearing-impaired person provided by the embodiment of the present invention can be applied not only to the hearing-impaired person, but also to ordinary people.
- FIG. 6 is a flowchart of a method for assisting a hearing impaired person to communicate according to another embodiment of the present invention. The difference from the method shown in FIG. 1 is that the method shown in FIG. 6 further includes the following content.
- In step S64, text is received.
- For example, a connected keyboard enables a hearing impaired person to enter text through the keyboard.
- Alternatively, a hearing impaired person can enter text through the interactive interface of a client.
- The client may be a mobile app.
- In step S65, the received text is converted into speech; for example, the conversion is achieved by using text-to-speech (TTS) technology.
- In step S66, the converted speech is played.
- When the hearing-impaired person cannot speak or has limited ability to speak, he or she can enter text to express his or her meaning and communicate with others.
- Steps S64 to S66 may also be performed before steps S60 to S63; there is no restriction on the order.
- The method for assisting a hearing impaired person to communicate may further include the following: setting the language of the text into which speech is converted and/or the language of the speech into which text is converted.
- In this case, the "hearing impaired person" need not be a person with limited hearing ability, but may be an "equivalent hearing impaired person", that is, a person who does not understand the language spoken by others because it differs from his or her own.
- The method for assisting the hearing impaired to communicate may further include the following: determining the position information of the hearing impaired person; and sending the position information to the mobile terminal and/or the client, so that the mobile terminal and/or the client can obtain the position information in real time.
- A contact related to the hearing impaired person can thus obtain that person's position information in real time to confirm that he or she is safe, and can find him or her as soon as possible if something happens.
- The position information of the hearing-impaired person can be determined in real time through GPS positioning technology.
- Before sending the position information to the mobile terminal and/or the client, the method further includes: receiving a setting of a contact, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
- The position information of the hearing-impaired person can thus be sent directly to contacts who are able to arrive in time, so that when the hearing-impaired person is in difficulty, a contact can reach his or her location as soon as possible and help solve the problem.
- the correspondence relationship between the hearing-impaired persons and the mobile terminals and / or clients used by them can be set in advance.
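The contact selection and location sharing described above can be sketched as a lookup followed by a send. This is a hedged illustration only: the source does not define any data format or transport, so the mapping from contacts to terminals, the `send` callback, and the function name `send_location` are all assumptions.

```python
def send_location(location, selected_contacts, terminals, send):
    """Send `location` to the terminal registered for each selected contact.

    `terminals` is a pre-configured mapping from contact name to a terminal
    or client identifier. `send` is a caller-supplied transport function
    (hypothetical). Returns the identifiers that were actually contacted.
    """
    delivered = []
    for contact in selected_contacts:
        terminal = terminals.get(contact)
        if terminal is None:
            continue  # no terminal registered for this contact
        send(terminal, location)
        delivered.append(terminal)
    return delivered


if __name__ == "__main__":
    registry = {"parent": "phone-01", "teacher": "app-client-07"}  # example data
    outbox = []
    send_location(
        {"lat": 31.23, "lon": 121.47},  # e.g. a GPS fix
        ["parent", "neighbor"],
        registry,
        lambda terminal, location: outbox.append((terminal, location)),
    )
    print(outbox)
```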
- The method for assisting the hearing impaired to communicate may further include the following: recording, in the order in which voices are received, the orientation and text corresponding to each sound source, and storing them locally or in the cloud, to help the hearing impaired person recall and share the conversation afterwards.
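An ordered record of this kind could be kept as in the sketch below, which covers only local storage. The class names `Entry` and `ConversationLog`, the field names, and the choice of JSON are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Entry:
    sequence: int      # order in which the voice was received
    source_id: str     # which sound source spoke
    orientation: str   # e.g. "10 o'clock, 50 cm"
    text: str          # text converted from the voice


class ConversationLog:
    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def record(self, source_id: str, orientation: str, text: str) -> Entry:
        """Append one entry, numbering it by arrival order."""
        entry = Entry(len(self._entries) + 1, source_id, orientation, text)
        self._entries.append(entry)
        return entry

    def to_json(self) -> str:
        return json.dumps([asdict(e) for e in self._entries],
                          ensure_ascii=False, indent=2)

    def save(self, path: str) -> None:
        """Store the log in a local file; cloud upload is out of scope here."""
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(self.to_json())


if __name__ == "__main__":
    log = ConversationLog()
    log.record("A", "10 o'clock, 50 cm", "Hello")
    log.record("B", "2 o'clock, 1 m", "Hi")
    print(log.to_json())
```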
- FIG. 7 is a structural block diagram of a device for assisting a hearing-impaired person to communicate according to another embodiment of the present invention.
- the device includes a voice receiving module 1, a determining module 2, a voice recognition module 3, and a display module 4.
- the voice receiving module 1 is configured to receive voice of at least one sound source.
- the determining module 2 is configured to determine the position of each sound source in the at least one sound source based on the received speech of each sound source in the at least one sound source.
- the voice recognition module 3 is used for recognizing the voice of each of the at least one sound source to convert the voice of each of the at least one sound source into text.
- The display module 4 is configured to display the position of each sound source in the at least one sound source and the text converted from its voice.
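The way the four modules cooperate can be sketched as a simple composition. The class name `AssistDevice`, the callable-based interfaces, and the use of `bytes` for a captured voice are hypothetical choices for illustration; the source describes the modules only functionally.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AssistDevice:
    receive: Callable[[], List[bytes]]    # voice receiving module 1
    locate: Callable[[bytes], str]        # determining module 2
    recognize: Callable[[bytes], str]     # voice recognition module 3
    display: Callable[[str, str], None]   # display module 4

    def run_once(self) -> int:
        """Process every voice currently available; return how many were shown."""
        voices = self.receive()
        for voice in voices:
            orientation = self.locate(voice)
            text = self.recognize(voice)
            self.display(orientation, text)
        return len(voices)


if __name__ == "__main__":
    # Stub modules standing in for real microphones, recognizers and displays.
    device = AssistDevice(
        receive=lambda: [b"hello", b"hi"],
        locate=lambda voice: "10 o'clock",
        recognize=lambda voice: voice.decode(),
        display=lambda orientation, text: print(f"[{orientation}] {text}"),
    )
    device.run_once()
```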
- The method for assisting the hearing impaired to communicate provided by the embodiment of the present invention can be applied not only to the hearing impaired, but also to ordinary people.
- The device for assisting the hearing impaired to communicate further includes: a text receiving module for receiving text; a text conversion module for converting the received text into speech; and a voice playback module for playing the converted speech.
- the display module may be a near-eye display.
- the distance between the near-eye display and the eyeball may be less than 2 cm.
- the near-eye display may include a see-through near-eye display or a non-see-through near-eye display. In this way, the position of each sound source and the text converted by the voice are presented in front of the eyes.
- the display module may be a see-through near-eye display. In this way, while not affecting the hearing-impaired person's observation of other things, the hearing-impaired person can understand the position of each sound source and the text converted by the voice by watching "subtitles".
- The device further includes: a positioning module for determining the location information of the hearing impaired person; and a communication module for sending the location information to the mobile terminal and/or the client, so that the mobile terminal and/or the client obtains the location information in real time.
- The device further includes: a contact setting module configured to receive the setting of a contact before the communication module sends the location information to the mobile terminal and/or the client, wherein the mobile terminal and/or client is the mobile terminal and/or client corresponding to the selected contact.
- the device for assisting a hearing-impaired person to communicate further includes a storage module.
- the storage module is used to record the position and text corresponding to each sound source according to the order of receiving voices, to further help the hearing impaired to remember and share afterwards.
- the storage module records the orientation and text corresponding to each sound source, which may be storing the orientation and text corresponding to each sound source on the local end or in the cloud.
- The specific working principle and benefits of the device for assisting the hearing impaired to communicate provided by the embodiment of the present invention are similar to those of the method for assisting the hearing impaired to communicate provided by the embodiment of the present invention, and are not described again here.
- another aspect of the embodiments of the present invention provides a system for assisting a hearing impaired person to communicate.
- the system includes: the device described in the foregoing embodiment and a client.
- the client can receive text input by the user; and / or can receive location information of the hearing impaired.
- Another aspect of the embodiments of the present invention provides augmented reality glasses, which include the device described in the above embodiments.
- The augmented reality glasses include an electronic circuit system that supports the operation of the device described in the foregoing embodiment.
- The electronic circuit system includes a power source, a processor, a network connection module, a voice receiving module, a text receiving module, and a voice playing module.
- the electronic circuit system may further include an externally visible human-machine interface module and buttons and / or a touch control panel.
- the processor includes the determination module, the speech recognition module, and the text conversion module described in the foregoing embodiments.
- the human-machine interface module includes a display module.
- the processor can also perform offline speech recognition locally, or online speech recognition in the cloud via a network connection.
- The touch control panel, the buttons, and/or the voice receiving module may be provided on the glasses or on accessories of the augmented reality glasses, for example, on the temples, frames, or lenses.
- The voice receiving module may be disposed on the frame, on the same temple or on different temples, or at a position close to the ears (both ears or one ear), so as to sit as close to the ear as possible.
- When the voice receiving module is a microphone array that includes two microphones, the two microphones are respectively disposed on the two frames, or at different positions on the same temple, or on the two temples respectively.
- A plurality of microphones may also be arranged on the frame and/or the temples according to the actual situation.
- Using a microphone array is significant because it imposes no requirement on the distance between the sound source and the voice receiving module. A microphone array can adapt to various distances and can meet the requirements of most communication scenarios. Here, the distance refers to the distance between the sound source and the microphone array.
- For example, the distance between the sound source and the voice receiving module may be between 50 cm and 1 m; in a multi-person group conversation, the distance is between 1 m and 2 m; in a conference, the distance is 3 m; in class, the distance is 3 m to 5 m; and so on.
- When the display module is a near-eye display, the orientation of each sound source and the text converted from its voice are presented in front of the eyes.
- The near-eye display may be see-through or non-see-through.
- When the near-eye display is a see-through near-eye display, it does not affect the hearing-impaired person's view of the real scene. Through graphical indications superimposed on the real scene, the hearing-impaired person can see the position of each sound source and the text converted from its voice, so that while watching the "subtitles" to understand the speech, he or she also obtains positional awareness similar to that of a hearing person.
- The near-eye display may be a monochrome display, which uses a preset background color and a preset foreground color to display the orientation and text corresponding to the sound source.
- The near-eye display can also be a color display, in which the background color and foreground color are alternately swapped to display the orientation and text corresponding to different sound sources.
- For the specific alternation method, refer to the content described in the foregoing embodiment. In this way, distraction of the hearing-impaired person is largely avoided, allowing him or her to focus on the content itself; at the same time, hearing people can converse normally without the discomfort of being interrupted or having to shift their attention.
- another aspect of the embodiments of the present invention further provides a machine-readable storage medium, where the machine-readable storage medium stores instructions, and the instructions are used to cause a machine to execute the method described in the foregoing embodiments.
- Displaying the text converted from the voice of each sound source in the at least one sound source enables the hearing impaired person to understand what each sound source is saying; displaying the position of each sound source provides a simple and intuitive way to show the hearing impaired person where each voice originates.
- While watching the "subtitles" to understand the speech, the hearing-impaired person also obtains positional awareness similar to that of a hearing person. The hearing-impaired person can therefore accurately grasp both the content of the voice emitted by each sound source and the position of that sound source, which makes it easier to communicate with others.
- the text input by the hearing impaired is converted into speech and the converted speech is played.
- the hearing impaired person can express his meaning by entering text and communicate with others.
- the received speech is converted into text expressed in the language used by the "deemed hearing-impaired person", and/or the text entered by the "deemed hearing-impaired person" is converted into speech expressed in the language used by the other party, thereby enabling communication between the "deemed hearing-impaired person" and others.
- the program is stored in a storage medium and includes a number of instructions that cause a microcontroller, a chip, or a processor to execute all or part of the steps of the method described in each embodiment of the present application.
- the foregoing storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Optics & Photonics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- Educational Administration (AREA)
- Educational Technology (AREA)
- Theoretical Computer Science (AREA)
- User Interface Of Digital Computer (AREA)
- Telephone Function (AREA)
Abstract
Description
The present invention relates to the field of augmented reality technology, and in particular to a method, apparatus, and system for assisting communication, and to augmented reality glasses.
Augmented reality (AR) is a technology that computes the position and angle of camera imagery in real time and superimposes corresponding images, video, and 3D models on it, thereby fusing the virtual world with the real world. An AR client can use image-recognition material stored locally to perform real-time image recognition on the user's offline environment and, at the position of a recognized offline target in the real scene, display the corresponding presentation data in augmented form according to a preconfigured display effect. As the technology has developed, AR has found wide application, but it has so far done little to help people with hearing impairments.
At present, hearing-impaired people communicate with hearing people mainly in two ways: through a sign language interpreter or by wearing a hearing aid. Both approaches, however, have shortcomings for hearing-impaired people, especially in settings where several people are talking.
Summary of the Invention
An object of the present invention is to provide a method, apparatus, and system for assisting communication, and augmented reality glasses, that enable a hearing-impaired person to accurately grasp both the content of the speech emitted by a sound source and the position of that sound source.
To achieve the above object, one aspect of the present invention provides a method for assisting a hearing-impaired person in communicating, the method comprising: receiving speech from at least one sound source; determining, based on the received speech of each of the at least one sound source, the position of each sound source; recognizing the speech of each sound source so as to convert it into text; and displaying the position of each sound source together with the text converted from its speech.
Optionally, the method further comprises: receiving text; converting the received text into speech; and playing the converted speech.
Optionally, the position of each sound source and the text converted from its speech are displayed in either of the following forms: using a preset foreground color and a preset background color; or alternately swapping the preset foreground color and the preset background color so as to distinguish the positions and text of different sound sources, wherein the preset foreground color and the preset background color are different colors.
Optionally, the preset foreground color is white and the preset background color is green; or the preset foreground color is green and the preset background color is white.
Optionally, the method further comprises: determining location information of the hearing-impaired person; and sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or client obtains the location information in real time.
Optionally, before the location information is sent to the mobile terminal and/or client, the method further comprises: receiving a contact setting, wherein the mobile terminal and/or client is the one corresponding to the selected contact.
Correspondingly, another aspect of the present invention provides an apparatus for assisting a hearing-impaired person in communicating, the apparatus comprising: a speech receiving module for receiving speech from at least one sound source; a determining module for determining, based on the received speech of each of the at least one sound source, the position of each sound source; a speech recognition module for recognizing the speech of each sound source so as to convert it into text; and a display module for displaying the position of each sound source together with the text converted from its speech.
Optionally, the apparatus further comprises: a text receiving module for receiving text; a text conversion module for converting the received text into speech; and a speech playback module for playing the converted speech.
Optionally, the display module is a near-eye display.
Optionally, the near-eye display is a see-through near-eye display.
Optionally, the position of each sound source and the text converted from its speech are displayed in either of the following forms: using a preset foreground color and a preset background color; or alternately swapping the preset foreground color and the preset background color so as to distinguish the positions and text of different sound sources, wherein the preset foreground color and the preset background color are different colors.
Optionally, the preset foreground color is white and the preset background color is green; or the preset foreground color is green and the preset background color is white.
Optionally, the apparatus further comprises: a positioning module for determining location information of the hearing-impaired person; and a communication module for sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or client obtains the location information in real time.
Optionally, the apparatus further comprises: a contact setting module for receiving a contact setting before the communication module sends the location information to the mobile terminal and/or client, wherein the mobile terminal and/or client is the one corresponding to the selected contact.
In addition, another aspect of the present invention provides augmented reality glasses comprising the apparatus described above.
Another aspect of the present invention further provides a system for assisting a hearing-impaired person in communicating, the system comprising: the apparatus described above; and a client.
Another aspect of the present invention further provides a machine-readable storage medium storing instructions that cause a machine to execute the method described above.
With the above technical solution, displaying the text converted from the speech of each of the at least one sound source lets the hearing-impaired person understand what each sound source is saying, and displaying the position of each sound source indicates, in a simple and intuitive way, where the speech is coming from. While reading the "subtitles" to understand the speech, the hearing-impaired person thus also gains a sense of position similar to that of a hearing person, and can understand the content of the speech from each sound source while clearly and intuitively knowing where that source is. This lets the hearing-impaired person accurately grasp both the content and the position of each sound source, which facilitates communication with others. Accurately grasping the content and position of each sound source is particularly helpful in environments with multiple sound sources.
Other features and advantages of the embodiments of the present invention are described in detail in the Detailed Description section that follows.
The accompanying drawings provide a further understanding of the embodiments of the present invention and constitute a part of the specification. Together with the detailed description below, they serve to explain the embodiments but do not limit them. In the drawings:
FIG. 1 is a flowchart of a method for assisting a hearing-impaired person in communicating according to an embodiment of the present invention;
FIG. 2 is an example of using an arrow to indicate direction according to another embodiment of the present invention;
FIG. 3 is an example of a position according to another embodiment of the present invention;
FIG. 4 is an example of displaying the position of one sound source and the text converted from its speech according to another embodiment of the present invention;
FIG. 5 is an example of displaying the positions of multiple sound sources and the text converted from their speech according to another embodiment of the present invention;
FIG. 6 is a flowchart of a method for assisting a hearing-impaired person in communicating according to another embodiment of the present invention; and
FIG. 7 is a structural block diagram of an apparatus for assisting a hearing-impaired person in communicating according to another embodiment of the present invention.
Description of Reference Signs
1 speech receiving module 2 determining module
3 speech recognition module 4 display module
Specific implementations of the embodiments of the present invention are described in detail below with reference to the drawings. It should be understood that the specific implementations described here serve only to illustrate and explain the embodiments of the present invention and are not intended to limit them.
One aspect of the embodiments of the present invention provides a method for assisting a hearing-impaired person in communicating. FIG. 1 is a flowchart of such a method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps.
In step S10, speech from at least one sound source is received.
In step S11, the position of each of the at least one sound source is determined based on the received speech of that sound source.
The position of a sound source may be determined from the time at which speech from the source is received. For example, the speech receiving module that receives speech from the at least one sound source includes multiple speech acquisition modules placed at different positions, so speech emitted by the same sound source reaches the individual acquisition modules at different times. For each of the at least one sound source, the position of the source is determined from these differences in arrival time, that is, from the time differences of arrival across the acquisition modules. Optionally, in embodiments of the present invention, each speech acquisition module may be a microphone and the speech receiving module may be a microphone array. For example, the microphone array may include 2, 4, 6, 7, or 8 microphones.
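The time-difference idea in step S11 can be illustrated with a minimal sketch for a single pair of microphones. It is not taken from the patent: it assumes a far-field source (so the wavefront is approximately planar), a speed of sound of about 343 m/s, and a hypothetical function name and sign convention. A single pair cannot tell front from back; resolving that would need more microphones, as in the larger arrays mentioned above.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at roughly 20 °C


def direction_from_time_difference(delta_t_s, mic_spacing_m):
    """Estimate the direction of a far-field source from one microphone pair.

    delta_t_s: arrival time at the left microphone minus arrival time at the
        right microphone, in seconds (positive when the right one hears first).
    mic_spacing_m: distance between the two microphones, in metres.

    Returns the angle in degrees between the source direction and the line
    perpendicular to the microphone pair; positive means towards the right.
    """
    # Extra path length to the farther microphone is spacing * sin(angle).
    ratio = SPEED_OF_SOUND_M_S * delta_t_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1]; clamp it.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

With microphones 14 cm apart (roughly the width of a glasses frame, chosen here only for illustration), a zero time difference maps to a source straight ahead or straight behind, and the largest physically possible difference maps to a source directly to one side.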
Optionally, in embodiments of the present invention, the reference point for the position may be set according to actual conditions; for example, it may be the location of the speech receiving module. Specifically, it may be any one of the multiple speech acquisition modules, or the midpoint of the multiple speech acquisition modules. Moreover, when the speech receiving module is worn by the hearing-impaired person or is close to that person, taking the speech receiving module as the reference point is in effect taking the hearing-impaired person as the reference point, so the hearing-impaired person can tell from the determined position where the sound source is relative to himself or herself.
Optionally, in embodiments of the present invention, the position may include a direction and/or a distance. Optionally, the direction may be indicated by an arrow located within a region bounded by a circle. The arrow starts at the center of the circle, which corresponds to the location of the hearing-impaired person, and deviates by some angle from the vertical axis through the circle, as shown in FIG. 2, where the vertical dashed line is that vertical axis. The horizontal axis of the circle, shown as the horizontal dashed line in FIG. 2, serves as a further reference: an arrow in the part above the horizontal axis means the sound source is in front of the hearing-impaired person, and an arrow in the part below it means the sound source is behind. In the example of FIG. 2, the arrow indicates a sound source in front of the hearing-impaired person. This arrow representation can also be read as a clock face: the circle represents the dial, the upper half of the vertical axis points to 12 o'clock, and the angle by which the arrow deviates from 12 o'clock gives the approximate clock direction of the sound source. In the example of FIG. 2, the arrow indicates a sound source at roughly 10 o'clock. When the position includes both a direction and a distance, it may be represented as in the example of FIG. 3. Where the distance is displayed may be set according to actual conditions and is not limited here. When the position includes only a distance, only the distance may be displayed. In particular, when the received speech is determined to come from the hearing-impaired person himself or herself and directions are indicated by arrows, "O" or "●" is displayed at the center of the circle to indicate the direction of the sound source. The position may also be described in text; taking the position shown in FIG. 3 as an example, the text "direction is ten o'clock, distance is 50 cm" may be displayed.
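The clock-face reading of the arrow can be sketched as a small conversion from an angle to a clock direction. The function names, the choice of measuring the angle clockwise from the 12 o'clock axis, and the exact wording of the text output are assumptions made for illustration, not details given in the patent.

```python
import math


def clock_hour(angle_deg):
    """Map an angle measured clockwise from straight ahead to a clock hour.

    0 degrees is 12 o'clock; each hour on the dial spans 30 degrees.
    """
    hour = round((angle_deg % 360) / 30) % 12
    return 12 if hour == 0 else hour


def is_in_front(angle_deg):
    """True when the arrow points above the horizontal axis of the dial."""
    return math.cos(math.radians(angle_deg)) > 0


def describe_position(angle_deg, distance_cm=None):
    """Build a text description of the position, with the distance optional."""
    text = f"direction is {clock_hour(angle_deg)} o'clock"
    if distance_cm is not None:
        text += f", distance is {distance_cm} cm"
    return text
```

For instance, an arrow tilted 60 degrees to the left of the vertical axis (an angle of -60 under this convention) reads as 10 o'clock and lies in front of the wearer, which is consistent with the FIG. 2 example described above.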
In step S12, the speech of each of the at least one sound source is recognized so as to convert it into text. For example, the conversion may be performed using speech recognition technology.
In step S13, the position of each of the at least one sound source and the text converted from its speech are displayed. An example of displaying the position of a sound source and the text converted from its speech is shown in FIG. 4. When the text converted from a sound source's speech does not fit on one line, it may be wrapped onto further lines automatically or displayed by scrolling.
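The wrapping and scrolling behaviour mentioned for step S13 might be approximated as follows. This is only a sketch using Python's standard `textwrap` module; the function name and the `width` and `max_lines` parameters are hypothetical, and a real near-eye display would measure rendered glyph widths rather than character counts.

```python
import textwrap


def caption_lines(text, width, max_lines):
    """Wrap recognized text to the display width and keep the newest lines.

    Keeping only the last max_lines lines gives a simple scrolling effect:
    older lines drop off the top as more text arrives.
    """
    lines = textwrap.wrap(text, width=width)
    return lines[-max_lines:]
```

Calling the function again each time the recognized text grows would redraw the caption area with the most recent lines.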
It should also be noted that FIG. 4 shows the locations of the regions displaying the position and the converted text only by way of example; the locations of these two display regions may be chosen according to actual conditions and are not limited. The position and converted text of each of the at least one sound source may be displayed as in the example of FIG. 4. When the positions and converted text of multiple sound sources are displayed, the sound sources may be arranged one below another, as shown in FIG. 5. They may also be arranged side by side from left to right, or in some other arrangement; no limitation is imposed in this regard.
Optionally, in embodiments of the present invention, the position and text may be displayed in many ways. For example, they may be displayed using a preset foreground color and a preset background color that differ from each other: white text on a black background, black text on a white background, white text on a green background, or green text on a white background. This lets the user make out the text more clearly. As another example, the positions and text of different sound sources may be displayed by alternately swapping the preset foreground color and the preset background color. That is, the two preset colors differ, and, following the order in which speech is received, the foreground and background colors are swapped when two adjacent utterances come from different sound sources and left unchanged when they come from the same sound source. The preset colors may be chosen according to actual conditions, for example white on black, black on white, white on green, or green on white. The alternation is illustrated below with white and green as the two preset colors. Suppose an utterance (called the first utterance; the names are only for ease of description and are not limiting) comes from a first sound source; its text and the position of the first sound source are displayed as white text on a green background. If the next utterance received (the second utterance) comes from a different sound source (the second sound source), its text and the position of the second sound source are displayed as green text on a white background. If the utterance after that (the third utterance) also comes from the second sound source, its text and the position of the second sound source are still displayed as green text on a white background. If the following utterance (the fourth utterance) comes from a sound source other than the second (the third sound source, which may be the first sound source or any other source except the second), its text and the position of the third sound source are displayed as white text on a green background. This continues until the information for all received utterances has been displayed, where the information for an utterance comprises the text converted from it and the position of its sound source.
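The alternating color rule can be sketched as a small function over the utterances in the order they are received. The data layout (pairs of a source identifier and text) and all names are assumptions for illustration; only the swapping rule itself comes from the description above.

```python
# (foreground, background) pairs; the first is white text on a green background.
SCHEMES = (("white", "green"), ("green", "white"))


def assign_color_schemes(utterances):
    """Attach a color scheme to each (source_id, text) pair.

    The scheme is swapped whenever the source differs from that of the
    previous utterance and kept when the source stays the same.
    """
    result = []
    scheme_index = 0
    previous_source = None
    for source_id, text in utterances:
        if previous_source is not None and source_id != previous_source:
            scheme_index = 1 - scheme_index  # swap foreground and background
        result.append((source_id, text, SCHEMES[scheme_index]))
        previous_source = source_id
    return result
```

Run on the four-utterance walk-through above (first source, second source, second source again, then a source other than the second), this yields white on green, green on white, green on white, and white on green.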
Displaying the text converted from the speech of each of the at least one sound source lets the hearing-impaired person understand what each sound source is saying, and displaying the position of each sound source indicates, in a simple and intuitive way, where the speech is coming from. While reading the "subtitles" to understand the speech, the hearing-impaired person thus also gains a sense of position similar to that of a hearing person, and can understand the content of the speech from each sound source while clearly and intuitively knowing where that source is. This lets the hearing-impaired person accurately grasp both the content and the position of each sound source, which facilitates communication with others. Accurately grasping the content and position of each sound source is particularly helpful in environments with multiple sound sources. In addition, the method described in the embodiments of the present invention places very little operating burden on the user, who can "hear" information otherwise beyond his or her ability without operating any technical system at all. It should also be noted that the method provided by the embodiments of the present invention is applicable not only to hearing-impaired people but also to people with ordinary hearing.
FIG. 6 is a flowchart of a method for assisting a hearing-impaired person in communicating according to another embodiment of the present invention. It differs from the method shown in FIG. 1 in that it further includes the following.
In step S64, text is received. The hearing-impaired person may enter text in many ways. For example, a keyboard may be connected so that the person types text on it; an interactive interface may be connected so that the person enters text through it; or a client may be connected so that the person enters text through the client. Optionally, the client may be a mobile phone app.
In step S65, the received text is converted into speech, for example using text-to-speech (TTS) technology.
In step S66, the converted speech is played.
In this way, a hearing-impaired person who cannot speak or whose speech is limited can express himself or herself by entering text and thereby communicate with others.
It should be noted that steps S64 to S66 may also be performed before steps S60 to S63; no limitation is imposed in this regard.
可选地,在本发明实施例中,在接收到至少一个声源的语音和/或接收文字之前,该用于辅助听障人士进行交流的方法还可以包括以下内容:接收对语音所转化成的文字的语言和/或文字所转化成的语音的语言的设定。在该实施例中,“听障人士”可能并非是真正的听力能力受限制的人士,可以是不懂与之交流的他人的语言的“第一视同听障人士”,或者是与之交流的他人不同其语言的“第二视同听障人士”。设定“第一视同听障人士”使用的第一语言,将接收到至少一个声源的语音转化成采用第一语言表达的文字,“第一视同听障人士”通过看转化后的文字明白与之交流的他人讲话的内容。设定与“第二视同听障人士”进行交流的他人使用的第二语言,将“第二视同听障人士”输入的文字转化为采用第二语言进行表达的语音,他人可以通过听语音明白“视同第二听障人士”所要表达的意思。如此, 实现了“视同听障人士”与他人之间的交流。Optionally, in the embodiment of the present invention, before the voice of at least one sound source is received and / or the text is received, the method for assisting a hearing impaired person to communicate may further include the following content: The language of the text and / or the language of the speech into which the text is converted. In this embodiment, the “hearing impaired person” may not be a person with limited hearing ability, but may be a “first-view equal hearing impaired person” who does not understand the language of others Of people who are different in their language are "second-view parity hearing impaired." Set the first language used by the “first-view hearing-impaired person” to convert the speech received by at least one sound source into words expressed in the first language. The text understands what others are talking to. Set a second language used by others who communicate with the “second-view hearing-impaired person”, and convert the text entered by the “second-view hearing-impaired person” into a voice expressed in a second language. The voice understands what "as a second hearing impaired" means. In this way, communication between the “persons with hearing impairment” and others is realized.
Optionally, in an embodiment of the present invention, the method for assisting a hearing-impaired person to communicate may further include: determining location information of the hearing-impaired person; and sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time. In this way, contacts associated with the hearing-impaired person can obtain the person's location in real time to confirm that the person is safe, and can find the person as quickly as possible if something goes wrong. In an embodiment of the present invention, the location information of the hearing-impaired person may be determined in real time by GPS positioning.
Optionally, in an embodiment of the present invention, before the location information is sent to the mobile terminal and/or the client, the method further includes: receiving a setting of contacts, where the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact. When many contacts are associated with the hearing-impaired person, only some of them, depending on the circumstances, can reach the person promptly when the person runs into difficulty. The location information can therefore be sent directly to those contacts who can arrive in time, so that they reach the hearing-impaired person's location as soon as possible and help resolve the difficulty. In addition, the correspondence between the contacts associated with the hearing-impaired person and the mobile terminals and/or clients they use may be set in advance.
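The contact selection and location push described above can be sketched as follows. This is a minimal illustration only: the names `Location`, `push_location`, `contact_endpoints` and the injected `send` callable are hypothetical and do not come from the patent, which does not specify a transport or data format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Location:
    latitude: float
    longitude: float


def push_location(
    location: Location,
    selected_contacts: Iterable[str],
    contact_endpoints: Dict[str, str],
    send: Callable[[str, Location], None],
) -> List[str]:
    """Send the location to the terminal/client of each selected contact.

    contact_endpoints is the preset contact -> terminal/client mapping.
    Returns the names of the contacts the location was sent to.
    """
    delivered = []
    for name in selected_contacts:
        endpoint = contact_endpoints.get(name)
        if endpoint is None:
            continue  # no terminal/client registered for this contact
        send(endpoint, location)
        delivered.append(name)
    return delivered
```

Injecting `send` keeps the sketch independent of any particular network layer; a real device would call it each time the positioning module reports a new fix.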
In addition, in an embodiment of the present invention, the method for assisting a hearing-impaired person to communicate may further include: recording, in the order in which the speech is received, the orientation and text corresponding to each sound source, and storing them locally or in the cloud, to help the hearing-impaired person recall the conversation and share it afterwards.
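One way to picture this record is an append-only log keyed by reception order. The sketch below is an assumption-laden illustration: the class and field names (`Utterance`, `ConversationLog`, `source_id`, `azimuth_deg`) are hypothetical, and JSON is chosen only as an example serialization for local or cloud storage.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Utterance:
    seq: int            # position in the order of reception
    source_id: str      # hypothetical label for the sound source
    azimuth_deg: float  # orientation of the source relative to the wearer
    text: str           # text converted from the speech


@dataclass
class ConversationLog:
    entries: List[Utterance] = field(default_factory=list)

    def record(self, source_id: str, azimuth_deg: float, text: str) -> Utterance:
        """Append one utterance, numbering it by order of reception."""
        entry = Utterance(len(self.entries), source_id, azimuth_deg, text)
        self.entries.append(entry)
        return entry

    def to_json(self) -> str:
        """Serialize the log, e.g. for writing to a local file or uploading."""
        return json.dumps([asdict(e) for e in self.entries], ensure_ascii=False)
```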
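Correspondingly, another aspect of the embodiments of the present invention provides a device for assisting a hearing-impaired person to communicate. FIG. 7 shows a device for assisting a hearing-impaired person to communicate according to another embodiment of the present invention. As shown in FIG. 7, the device includes a voice receiving module 1, a determination module 2, a speech recognition module 3 and a display module 4. The voice receiving module 1 is configured to receive the speech of at least one sound source. The determination module 2 is configured to determine the orientation of each of the at least one sound source based on the received speech of that sound source. The speech recognition module 3 is configured to recognize the speech of each of the at least one sound source and convert it into text. The display module 4 is configured to display the orientation of each of the at least one sound source and the text converted from its speech.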
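The data flow between the four modules can be sketched as a small pipeline. This is an illustrative sketch, not the patent's implementation: the three callables stand in for the determination, speech recognition and display modules, and the `Caption` type and `run_pipeline` name are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class Caption:
    azimuth_deg: float  # orientation determined for the sound source
    text: str           # text converted from that source's speech


def run_pipeline(
    segments: Sequence[Any],
    determine_orientation: Callable[[Any], float],
    recognize_speech: Callable[[Any], str],
    display: Callable[[Caption], None],
) -> List[Caption]:
    """Pass each received speech segment through orientation, recognition, display."""
    captions = []
    for segment in segments:
        caption = Caption(determine_orientation(segment), recognize_speech(segment))
        display(caption)
        captions.append(caption)
    return captions
```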
Displaying the text converted from the speech of each of the at least one sound source lets the hearing-impaired person understand what each sound source is saying; displaying the orientation of each sound source indicates, simply and intuitively, where the speech is coming from. The hearing-impaired person thus reads the “subtitles” to understand the speech while gaining a sense of position similar to that of a hearing person, and can accurately grasp both the content and the orientation of each sound source, which makes communication with others easier. This is especially helpful in an environment with multiple sound sources. It should also be noted that the method for assisting a hearing-impaired person to communicate provided by the embodiments of the present invention is applicable not only to hearing-impaired persons but also to people without hearing impairment.
Optionally, in an embodiment of the present invention, the device for assisting a hearing-impaired person to communicate further includes: a text receiving module, configured to receive text; a text conversion module, configured to convert the received text into speech; and a voice playback module, configured to play the converted speech.
Optionally, in an embodiment of the present invention, the display module may be a near-eye display, which may be located less than 2 cm from the eyeball. The near-eye display may be see-through or non-see-through. In this way, the orientation of each sound source and the text converted from its speech are presented in front of the eyes. Preferably, the display module is a see-through near-eye display, so that the hearing-impaired person can read the “subtitles” to learn the orientation of each sound source and the text converted from its speech without being hindered from observing other things.
Optionally, in an embodiment of the present invention, the device further includes: a positioning module, configured to determine location information of the hearing-impaired person; and a communication module, configured to send the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time.
Optionally, in an embodiment of the present invention, the device further includes: a contact setting module, configured to receive a setting of contacts before the communication module sends the location information to the mobile terminal and/or the client, where the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
In addition, in an embodiment of the present invention, the device for assisting a hearing-impaired person to communicate further includes a storage module. The storage module is configured to record, in the order in which the speech is received, the orientation and text corresponding to each sound source, to help the hearing-impaired person recall the conversation and share it afterwards. The storage module may store the orientation and text corresponding to each sound source locally or in the cloud.
The specific working principle and benefits of the device for assisting a hearing-impaired person to communicate provided by the embodiments of the present invention are similar to those of the method provided by the embodiments of the present invention and are not repeated here.
In addition, another aspect of the embodiments of the present invention provides a system for assisting a hearing-impaired person to communicate. The system includes the device described in the foregoing embodiments and a client. The client may receive text entered by a user and/or may receive the location information of the hearing-impaired person.
In addition, another aspect of the embodiments of the present invention provides augmented reality glasses. The augmented reality glasses include the device described in the foregoing embodiments.
The augmented reality glasses include an electronic circuit system that supports operation of the device described in the foregoing embodiments. The electronic circuit system includes modules such as a power supply, a processor and a network connection, as well as the voice receiving module, the text receiving module and the voice playback module. The electronic circuit system may further include an externally visible human-machine interface module and buttons and/or a touch control panel. The processor includes the determination module, the speech recognition module and the text conversion module described in the foregoing embodiments. The human-machine interface module includes the display module. The processor may perform offline speech recognition locally, or online speech recognition in the cloud via the network connection.
Optionally, in an embodiment of the present invention, the touch control panel, the buttons and/or the voice receiving module may be arranged on the glasses or on an accessory of the augmented reality glasses, for example on a temple, the frame or a lens. Optionally, the voice receiving module may be arranged on the frame, on one temple or on both temples, or at a position close to the ears (both ears or one ear), so as to approximate the position of the ears as closely as possible. For example, when the voice receiving module is a microphone array that includes two microphones, the two microphones may be arranged on the two rims of the frame respectively, at different positions on the same temple, or on the two temples respectively. When the microphone array includes more than two microphones, the microphones may be distributed over the frame and/or the temples as the situation requires. When a microphone array is used, speech reaches each microphone in the array at a different time and with a different intensity; computing on these differences yields a clearer sound that is easier to process.
In addition, a microphone array has an important advantage over a single microphone or a noise-reduction microphone: it imposes no requirement on the distance between the sound source and the voice receiving module. A microphone array can adapt to a range of distances between the sound source and the array and can meet the requirements of most communication scenarios, for example: a two-person conversation, with the sound source 50 cm to 1 m from the voice receiving module; a small-group conversation, with the sound source 1 m to 2 m away; a meeting, with the sound source 3 m away; a class, with the sound source 3 m to 5 m away; and so on.
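As a rough illustration of how the arrival-time difference between two microphones relates to the orientation of a sound source, the sketch below estimates the delay by brute-force cross-correlation and converts it to an angle under a far-field model, sin(θ) = c·τ/d. The patent does not specify this algorithm; the function names, the sign convention (positive angle means the source is on the side of the first microphone) and the sample values are assumptions for illustration.

```python
import math
from typing import Sequence

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius


def estimate_delay_samples(first: Sequence[float], second: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means the signal reaches the second microphone later.
    """
    n = min(len(first), len(second))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += first[i] * second[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_azimuth(delay_samples: int, sample_rate: float, mic_spacing_m: float) -> float:
    """Convert an inter-microphone delay to an angle in degrees (far-field model)."""
    tau = delay_samples / sample_rate
    ratio = SPEED_OF_SOUND * tau / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against noise-induced overshoot
    return math.degrees(math.asin(ratio))
```

With only two microphones the angle is ambiguous between front and back; arrays with more microphones, as the text allows, can resolve this.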
In addition, when the display module is a near-eye display, the orientation of each sound source and the text converted from its speech are presented in front of the eyes. The near-eye display may be see-through or non-see-through. When it is a see-through near-eye display, graphical indications superimposed on the real scene let the hearing-impaired person see, in real time, the orientation of each sound source and the text converted from its speech without being hindered from observing the real scene, so that the person reads the “subtitles” to understand the speech while gaining a sense of position similar to that of a hearing person. To avoid distracting the hearing-impaired person, the near-eye display may be a monochrome display that shows the orientation and text corresponding to a sound source using a preset background color and a preset foreground color. Alternatively, the near-eye display may be a color display that shows the orientation and text corresponding to different sound sources by alternating the background color and the foreground color; for the specific alternation scheme, see the foregoing embodiments. This likewise avoids distraction and lets the hearing-impaired person focus on the content itself, while allowing normal face-to-face communication without the discomfort of being interrupted or having to shift the focus of attention.
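The alternation of background and foreground colors between sound sources might look like the sketch below. The exact scheme is defined in an earlier embodiment that is not reproduced here, so this is only one plausible reading, labeled as an assumption: the two colors are swapped each time the speaking source changes. The function name and default colors are hypothetical.

```python
from typing import List, Sequence, Tuple


def assign_styles(
    source_ids: Sequence[str], fg: str = "white", bg: str = "black"
) -> List[Tuple[str, str]]:
    """Return a (foreground, background) pair for each caption line.

    Assumed scheme: swap the two colors whenever the sound source
    differs from that of the previous line.
    """
    styles = []
    swapped = False
    previous = None
    for source_id in source_ids:
        if previous is not None and source_id != previous:
            swapped = not swapped
        styles.append((bg, fg) if swapped else (fg, bg))
        previous = source_id
    return styles
```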
In addition, another aspect of the embodiments of the present invention provides a machine-readable storage medium storing instructions that cause a machine to execute the method described in the foregoing embodiments.
In summary, displaying the text converted from the speech of each of the at least one sound source lets the hearing-impaired person understand what each sound source is saying; displaying the orientation of each sound source indicates, simply and intuitively, where the speech is coming from. The hearing-impaired person thus reads the “subtitles” to understand the speech while gaining a sense of position similar to that of a hearing person, and can accurately grasp both the content and the orientation of each sound source, which makes communication with others easier. This is especially helpful in an environment with multiple sound sources. Text entered by the hearing-impaired person is converted into speech and the converted speech is played, so that a hearing-impaired person who cannot speak or has limited speech ability can express what they mean by entering text and communicate with others.
In addition, received speech is converted into text expressed in the language used by the “deemed hearing-impaired person”, and/or text entered by the “deemed hearing-impaired person” is converted into speech expressed in the language used by the others communicating with that person. In this way, communication between a “deemed hearing-impaired person” and others is achieved.
Optional implementations of the embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present invention are not limited to the specific details of those implementations. Within the scope of the technical concept of the embodiments of the present invention, various simple modifications may be made to the technical solutions, and all such simple modifications fall within the protection scope of the embodiments of the present invention.
It should also be noted that the specific technical features described in the foregoing implementations may be combined in any suitable manner, provided they do not contradict one another. To avoid unnecessary repetition, the possible combinations are not described separately.
Those skilled in the art will understand that all or part of the steps of the methods in the foregoing embodiments may be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes a number of instructions that cause a microcontroller, a chip or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
In addition, the various implementations of the embodiments of the present invention may be combined arbitrarily; as long as a combination does not depart from the idea of the embodiments of the present invention, it should likewise be regarded as content disclosed by the embodiments of the present invention.
Claims (17)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810597337.3A CN108877407A (en) | 2018-06-11 | 2018-06-11 | Method, apparatus and system for assisting communication, and augmented reality glasses |
| CN201810597337.3 | 2018-06-11 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019237429A1 (en) | 2019-12-19 |
Family
ID=64337787
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2018/092815 Ceased WO2019237429A1 (en) | 2018-06-11 | 2018-06-26 | Method, apparatus and system for assisting communication, and augmented reality glasses |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN108877407A (en) |
| WO (1) | WO2019237429A1 (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111343554A (en) * | 2020-03-02 | 2020-06-26 | 开放智能机器(上海)有限公司 | Hearing aid method and system combining vision and voice |
| CN114007177B (en) * | 2021-10-25 | 2024-01-26 | 北京亮亮视野科技有限公司 | Hearing aid control method, device, hearing aid equipment and storage medium |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070195012A1 (en) * | 2006-02-22 | 2007-08-23 | Konica Minolta Holdings Inc. | Image display apparatus and method for displaying image |
| CN103946733A (en) * | 2011-11-14 | 2014-07-23 | 谷歌公司 | Display audible indications on a wearable computing system |
| CN106092118A (en) * | 2016-08-18 | 2016-11-09 | 安玉 | The safe walk help monitor of old man based on Beidou satellite navigation system |
| CN106686223A (en) * | 2016-12-19 | 2017-05-17 | 中国科学院计算技术研究所 | Auxiliary dialogue system, method and smart phone for deaf-mute and normal people |
| CN107223277A (en) * | 2016-12-16 | 2017-09-29 | 深圳前海达闼云端智能科技有限公司 | A deaf-mute assisting method, device and electronic equipment |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103049077A (en) * | 2011-10-14 | 2013-04-17 | 鸿富锦精密工业(深圳)有限公司 | Sound feedback device and working method thereof |
| US9542958B2 (en) * | 2012-12-18 | 2017-01-10 | Seiko Epson Corporation | Display device, head-mount type display device, method of controlling display device, and method of controlling head-mount type display device |
| CN105554662A (en) * | 2015-06-30 | 2016-05-04 | 宇龙计算机通信科技(深圳)有限公司 | Hearing-aid glasses and hearing-aid method |
| CN206178272U (en) * | 2016-10-12 | 2017-05-17 | 语联网(武汉)信息技术有限公司 | External multi -lingual smart machine of glasses |
2018
- 2018-06-11 CN CN201810597337.3A patent/CN108877407A/en active Pending
- 2018-06-26 WO PCT/CN2018/092815 patent/WO2019237429A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN108877407A (en) | 2018-11-23 | Method, apparatus and system for assisting communication, and augmented reality glasses |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18922647; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26/03/2021) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 18922647; Country of ref document: EP; Kind code of ref document: A1 |