CN112272361B - Voice processing method and system

Info

Publication number: CN112272361B
Application number: CN202011180555.0A
Authority: CN
Inventors: 韩贵春; 李松; 张秉政; 张硕
Original assignee: Harbin Hytera Technology Corp ltd
Current assignee: Harbin Hytera Technology Corp ltd
Priority date: 2020-10-29
Filing date: 2020-10-29
Publication date: 2022-05-31
Anticipated expiration: 2040-10-29
Also published as: CN112272361A

Abstract

The application provides a voice processing method and system. In the method, a mobile switching office receives an uplink voice data packet that is sent by a calling terminal and contains a first voice type identifier, and sends the uplink voice data packet to each participating base station having the first voice type identifier. The mobile switching office acquires a translation voice data packet corresponding to the uplink voice data packet, and sends the translation voice data packet, which contains a second voice type identifier, to each participating base station having the second voice type identifier, so that each participating base station can issue the translation voice data packet to its registered terminals according to its working time slot mode. The registered terminals under each participating base station receive a voice data packet and compare the voice type identifier in the packet with the local voice type identifier; if the two are consistent, the terminal receives and plays the packet, and if they are inconsistent, the terminal discards it. The method and system can thus support voice communication between different languages in a group call scenario.

Description

Voice processing method and system

Technical Field

The present application relates to the field of communications technologies, and in particular, to a method and a system for processing a voice.

Background

As society develops, users who speak different languages increasingly need to communicate by voice using the same devices. For example, during factory construction and production, a foreign organization often needs to make group calls with local workers over interphones (walkie-talkies).

Because the languages differ, language barriers make such voice communication difficult. A solution is therefore needed that provides real-time translation during the intercom process, so that users of different languages can communicate by voice in a group call scenario.

Disclosure of Invention

In view of this, the present application provides a voice processing method and system, which can be applied to voice communication between different languages in a group call scenario.

In order to achieve the above object, the present invention provides the following technical features:

a method of speech processing comprising:

the mobile switching office receives a group call request initiated by a calling terminal and establishes a group call;

the mobile switching office receives an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sends the uplink voice data packet to each participating base station with the first voice type identifier so that each participating base station can send the uplink voice data packet to a registered terminal according to a working time slot mode;

the mobile switching office acquires a translation voice data packet corresponding to the uplink voice data packet, and sends the translation voice data packet containing the second voice type identifier to each participating base station with the second voice type identifier, so that each participating base station can issue the translation voice data packet to a registered terminal according to a working time slot mode;

and the registered terminals under each participating base station receive the voice data packet, compare the voice type identifier in the voice data packet with the local voice type identifier, receive and play the voice data packet if the voice type identifier is consistent with the local voice type identifier, and discard the voice data packet if the voice type identifier is inconsistent with the local voice type identifier.

Optionally, the receiving, by the mobile switching office, a group call request initiated by a calling terminal and establishing a group call includes:

the mobile switching office receives a group call request initiated by a calling terminal, and determines each participating base station of the group member terminals corresponding to a group identifier in the group call request and a voice type identifier of each group member terminal;

under the condition that the voice type identifications of all group member terminals are not completely consistent, determining first state information which represents different voice type identifications and needs to execute translation operation;

sending the first state information to each participating base station;

each participating base station determines a working mode according to the first state information and the voice type identifiers of its registered terminals; if the voice type identifiers of the registered terminals of the participating base station are not completely consistent, the double-time-slot working mode is determined, and if they are completely consistent, the single-time-slot working mode is determined.

Optionally, when the participating base station is in the dual-timeslot working mode, the participating base station issues the uplink voice data packet to the registered terminal through the first timeslot, and issues the translation voice data packet to the registered terminal through the second timeslot.
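The time slot assignment described in the preceding paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `SlotMode`, `PacketKind`, and `select_timeslot`, and the slot numbers 1 and 2, are hypothetical and do not appear in the source.

```python
from enum import Enum


class SlotMode(Enum):
    SINGLE = "single"
    DUAL = "dual"


class PacketKind(Enum):
    UPLINK = "uplink"          # original speech from the calling terminal
    TRANSLATED = "translated"  # translated speech obtained by the MSO


def select_timeslot(mode: SlotMode, kind: PacketKind) -> int:
    """Return the (hypothetical) slot number used to issue a packet."""
    if mode is SlotMode.DUAL and kind is PacketKind.TRANSLATED:
        return 2  # second time slot carries the translated packets
    # First time slot in dual mode, or the only time slot in single mode.
    return 1
```

In single-time-slot mode the base station only ever receives one kind of packet, so the sketch maps both kinds to the same slot.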

Optionally, if the mobile switching office is connected to a translation server, the obtaining, by the mobile switching office, the translation voice data packet corresponding to the uplink voice data packet includes:

sending the uplink voice data packet to the translation server;

acquiring a translated voice data packet translated by the translation server according to the second voice type identifier;

and adding a second voice type identifier in the translation voice data packet.

Optionally, the mobile switching office completes the sending of all uplink voice data packets, and sends an end frame to each participating base station with the first voice type identifier, so that each participating base station sends the end frame to a registered terminal; after receiving the end frame, the participating base station feeds back the received end frame to the mobile switching office;

after all the translation voice data packets have been sent, the mobile switching office sends an end frame to each participating base station with the second voice type identifier, so that each participating base station sends the end frame to its registered terminals; after receiving the end frame, the participating base station feeds back to the mobile switching office that the end frame has been received;

and after receiving, from all the participating base stations, the feedback that the end frame has been received, the mobile switching office sends a speaking right releasing instruction to each participating base station, so that each participating base station can send the speaking right releasing instruction to its registered terminals.
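The end-frame bookkeeping at the mobile switching office can be sketched as follows. This is only an illustration: the class name `SpeakingRightTracker` and the choice to track outstanding feedback as (base station, voice type identifier) pairs are assumptions, since the source does not specify how the feedback is recorded.

```python
class SpeakingRightTracker:
    """Tracks end-frame feedback at the mobile switching office (illustrative)."""

    def __init__(self, expected: set[tuple[str, str]]) -> None:
        # Each entry is (base station name, voice type identifier) for which
        # an end frame was issued and feedback is still outstanding.
        self._pending = set(expected)

    def on_feedback(self, base_station: str, voice_type: str) -> bool:
        """Record one feedback; return True once all feedback has arrived."""
        self._pending.discard((base_station, voice_type))
        return not self._pending
```

Once `on_feedback` returns `True`, the mobile switching office would send the speaking right releasing instruction to every participating base station.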

Optionally, the method further includes:

displaying prompt information under the condition that the registered terminal receives the ending frame and does not receive the speaking right releasing instruction;

and after the registered terminal receives the command of releasing the speaking right, displaying that the group calling operation area is in an available state.
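The terminal-side display logic above can be sketched as a small decision function. The function name `display_state` and the returned labels are hypothetical; the source does not say what the prompt information contains, nor what is displayed before an end frame arrives, so the sketch returns `None` in that case.

```python
from typing import Optional


def display_state(end_frame_received: bool, release_received: bool) -> Optional[str]:
    """Decide what the terminal shows after a talk burst (illustrative labels)."""
    if release_received:
        # Speaking right released: the group call operation area is usable again.
        return "available"
    if end_frame_received:
        # End frame seen, but the speaking right has not been released yet.
        return "prompt"
    return None  # behaviour not specified by the source
```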

Optionally, after determining each participating base station corresponding to the group member terminal corresponding to the group identifier in the group call request and the voice type identifier of each group member terminal, the method further includes:

under the condition that the voice type identifications of all group member terminals are determined to be completely consistent, second state information which represents the same voice type identification and does not need to execute translation operation is determined, and the second state information is sent to all participating base stations;

and each participating base station determines a single-time-slot working mode according to the second state information.

Optionally, the method further includes:

initializing a terminal by using a configuration terminal to add a voice type identifier into the terminal;

each participating base station stores the voice type identification of the registered terminal in the coverage area and sends the voice type identification to the mobile switching office, so that the mobile switching office stores the voice type identification of the registered terminal in the coverage area of each base station;

the mobile switching office stores the group identifier of each group and the group member terminal identifiers contained in each group identifier.

A speech processing method applied to a mobile switching office, the method comprising:

receiving a group call request initiated by a calling terminal and establishing a group call;

receiving an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sending the uplink voice data packet to each participating base station with the first voice type identifier so that each participating base station can send the uplink voice data packet to a registered terminal according to a working time slot mode;

and acquiring a translation voice data packet corresponding to the uplink voice data packet, and sending the translation voice data packet containing the second voice type identifier to each participating base station with the second voice type identifier so that each participating base station can send the translation voice data packet to a registered terminal according to a working time slot mode.

A mobile switching office, comprising:

a memory for storing a software program;

and a processor for executing the software program in the memory and implementing the speech processing method.

A voice processing method is applied to a registered terminal under a participating base station, and comprises the following steps:

sending a group calling request or sending an uplink voice data packet containing a first voice type identifier;

or, alternatively,

receiving a voice data packet issued by a participating base station according to a working time slot mode; the voice data packet is an uplink voice data packet, or a translation voice data packet corresponding to the uplink voice data packet;

comparing the voice type identification in the voice data packet with the local voice type identification;

and if the voice type identifiers are consistent, receiving and playing the voice data packet, and if the voice type identifiers are inconsistent, discarding the voice data packet.
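The comparison performed by the registered terminal can be sketched as follows. This is a minimal illustration: the `VoicePacket` structure, its field names, and the function name `filter_packet` are assumptions, and the real packet format is not given in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoicePacket:
    voice_type: str  # voice type identifier carried in the packet
    payload: bytes   # encoded voice fragment


def filter_packet(local_voice_type: str, packet: VoicePacket) -> Optional[bytes]:
    """Return the payload to play, or None if the packet must be discarded."""
    if packet.voice_type == local_voice_type:
        return packet.payload  # identifiers consistent: receive and play
    return None                # identifiers inconsistent: discard
```

The same check applies whether the packet is the original uplink packet or its translation, because the terminal only looks at the identifier.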

A terminal, comprising:

a memory for storing a software program;

A speech processing system comprising:

the mobile switching center and a plurality of participating base stations connected with the mobile switching center, wherein each participating base station has a plurality of registered terminals in the coverage area;

the mobile switching office is used for receiving a group call request initiated by a calling terminal and establishing a group call, receiving an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sending the uplink voice data packet to each participating base station with the first voice type identifier; acquiring a translation voice data packet corresponding to the uplink voice data packet, and sending the translation voice data packet containing the second voice type identifier to each participating base station with the second voice type identifier;

each participating base station is used for sending an uplink voice data packet to the registered terminal according to the working time slot mode, or sending a translation voice data packet to the registered terminal according to the working time slot mode;

and the registered terminal is used for comparing the voice type identifier in the voice data packet with the local voice type identifier, receiving and playing the voice data packet if the two identifiers are consistent, and discarding the voice data packet if the two identifiers are inconsistent.

the mobile switching office is used for receiving a group calling request initiated by a calling terminal, determining each participating base station of a group member terminal corresponding to a group identifier in the group calling request and the voice type identifier of each group member terminal; under the condition that the voice type identifications of all group member terminals are not completely consistent, determining first state information which represents different voice type identifications and needs to execute translation operation; sending the first state information to each participating base station;

each participating base station is used for determining a working mode according to the first state information and the voice type identifiers of its registered terminals; if the voice type identifiers of the registered terminals of the participating base station are not completely consistent, the double-time-slot working mode is determined, and if they are completely consistent, the single-time-slot working mode is determined.

the mobile switching office is used for sending the uplink voice data packet to the translation server; acquiring a translated voice data packet translated by the translation server according to the second voice type identifier; and adding a second voice type identifier in the translation voice data packet.

Optionally, the mobile switching office is configured to complete issuing all uplink voice data packets, and issue an end frame to each participating base station having the first voice type identifier;

each participating base station is used for sending an ending frame to a registered terminal; after receiving the end frame, the participating base station feeds back the received end frame to the mobile switching office;

the mobile switching office is also used for sending an end frame to each participating base station with the second voice type identifier after all the translated voice data packets are sent;

and the mobile switching office is further used for sending a speaking right releasing instruction to each participating base station after receiving, from all the participating base stations, the feedback that the end frame has been received, so that each participating base station can send the speaking right releasing instruction to its registered terminals.

Optionally, the method further includes:

the registered terminal is used for displaying prompt information under the condition that an ending frame is received and a speaking right releasing instruction is not received; and after receiving the command of releasing the speaking right, displaying that the group calling operation area is in an available state.

the mobile switching office is used for determining second state information which represents the same voice type identifier and does not need to execute translation operation under the condition of determining that the voice type identifiers of all the group member terminals are completely consistent, and sending the second state information to all the participating base stations;

and each participating base station is used for determining a single-time-slot working mode according to the second state information.

Optionally, the system further comprises a configuration terminal;

the configuration terminal is used for carrying out an initialization operation on a terminal so as to add a voice type identifier into the terminal;

each participating base station is used for storing the voice type identification of the registered terminal in the coverage area and sending the voice type identification to the mobile switching office, so that the mobile switching office can store the voice type identification of the registered terminal in the coverage area of each base station;

and the mobile switching office is used for storing the group identification of each group and the group member terminal identification contained in each group identification.

Through the technical means, the following beneficial effects can be realized:

After the group call is established, the mobile switching office (MSO) can perform two operations in parallel. One operation sends the uplink voice data packet containing the first voice type identifier directly to the participating base stations that do not need the translation operation, and each such participating base station issues the uplink voice data packet to its registered terminals. A registered terminal under the participating base station receives and plays the uplink voice data packet only if its local voice type identifier is consistent with the first voice type identifier. This operation distributes the uplink voice data packet to the registered terminals.

The other operation translates the uplink voice data packet to obtain a translated voice data packet and sends the translated voice data packet to the participating base stations that need the translation operation, and each such participating base station issues the translated voice data packet to its registered terminals. A registered terminal under the participating base station receives and plays the translated voice data packet only if its local voice type identifier is consistent with the second voice type identifier. This operation delivers the translated voice data packet to the registered terminals. The scheme is therefore suitable for real-time voice communication between different languages in a group call scenario.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1a and fig. 1b are schematic diagrams of a speech processing system according to an embodiment of the present invention;

fig. 2 is a schematic diagram of a speech processing method according to an embodiment of the present invention;

fig. 3 is a schematic diagram of another speech processing method according to an embodiment of the present invention;

fig. 4 is a schematic diagram of another speech processing method according to an embodiment of the present invention;

fig. 5 is a schematic diagram of another speech processing method according to an embodiment of the present invention;

fig. 6 is a schematic diagram of a mobile switching office according to an embodiment of the present invention;

fig. 7 is a schematic diagram of a terminal according to an embodiment of the present invention;

fig. 8 is a schematic diagram of a speech processing apparatus according to an embodiment of the present invention;

fig. 9 is a schematic diagram of another speech processing apparatus according to an embodiment of the present invention.

Detailed Description

Interpretation of terms:

Cluster controller: Trunk Station Controller, TSC for short.

Mobile switching office: Mobile service Switching Office, MSO for short.

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The invention is suitable for voice intercom between two different languages, such as between Chinese and English, between English and Japanese, between Japanese and American, between American and Russian, between Russian and Chinese, etc. For convenience of illustration and understanding, the following embodiments take the two languages "Chinese" and "English" as an example to describe the speech processing method in detail. It will be appreciated that the method is equally applicable to voice intercom between any other two languages.

Referring to fig. 1a, the present invention provides a speech processing system comprising:

a mobile switching office 100.

A plurality of base stations 200 connected to a mobile switching office.

Each base station has a plurality of registered terminals 300 within its coverage area; in this application these are narrowband terminals that can be used to perform walkie-talkie operations. For brevity, the mobile switching office is hereinafter denoted MSO.

Referring to fig. 1b, on the basis of fig. 1a the system further includes: a translation server 400 connected to the MSO; and a configuration device 500.

For ease of understanding, fig. 1a and 1b take three base stations as an example: Base1, Base2, and Base3. Base1 has terminals of both Chinese users and English users, Base2 has terminals of English users only, and Base3 has terminals of Chinese users only.

Before performing the on-line speech processing, some preparatory operations need to be performed:

firstly, a terminal is initialized by using a configuration terminal so as to add a voice type identifier into the terminal.

Before each terminal is used, a network administrator operates the configuration device 500 to initialize the voice type identifier of that terminal. If the user of the terminal is a Chinese user, a Chinese type identifier is initialized for the terminal and stored in the terminal; if the user of the terminal is an English user, an English type identifier is initialized for the terminal and stored in the terminal.

Secondly, each base station stores the voice type identification of the registered terminal in the coverage area and sends the voice type identification to the MSO, so that the MSO can store the voice type identification of the registered terminal in the coverage area of each base station;

After the terminal is initialized, it becomes a registered terminal and can work in a group call mode. The base station interacts with the registered terminals in its coverage area, and each registered terminal sends its voice type identifier to the base station, so that the voice type identifiers of the registered terminals are stored in the base station.

Thirdly, the MSO stores the group identifier of each group, and the group member terminal identifier included in each group identifier. Each base station also sends the voice type identification of the registered terminal in the coverage area to the MSO, so that the MSO can store the voice type identification of the registered terminal in the coverage areas of different base stations respectively.

There may be different groups in the speech processing system, each containing different registered terminals, and the MSO stores the group identifier of each group and the voice type identifiers of the group member terminals of each group.

The invention provides a voice processing method, which is applied to a voice processing system shown in FIG. 1a or FIG. 1 b. Referring to fig. 2, the method comprises the following steps:

step S201: the mobile exchange office receives the group call request initiated by the calling terminal and establishes a group call.

For ease of description, the mobile switching office is hereinafter denoted as MSO. Referring to fig. 3, this step can be implemented in the following manner:

step S301: a calling terminal initiates a group calling request to an MSO through a calling base station; wherein the group call request includes a group identification.

A push area of "PTT allowed" may be displayed on the display screen of the calling terminal, indicating that the calling terminal has the right of talk when the push area of "PTT allowed" is colored, and indicating that the calling terminal does not have the right of talk when the push area of "PTT allowed" is gray.

In the case where the calling terminal has the right to speak, the user can select one group among different groups and can press the "PTT allowed" pressing area for a long time. At this time, the calling terminal initiates a group call request, and the group call request includes a group identifier.

And the calling base station sends the group calling request to the MSO.

Step S302: the MSO receives a group call request initiated by a calling terminal, determines each group member terminal identification corresponding to the group identification, each participating base station corresponding to each group member terminal identification, and determines a voice type identification corresponding to each group member terminal identification.

The MSO receives a group call request initiated by a calling terminal, determines the identification of each group member terminal in the group based on the group identification in the group call request, and acquires a base station corresponding to the identification of each group member terminal, which is called a participating base station.

For example, the three base stations corresponding to the group member terminal identifiers are Base1, Base2, and Base3 in fig. 1a.
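The lookup performed by the MSO in step S302 can be sketched as follows. This is an illustration under stated assumptions: the function name `resolve_group` and the three dictionaries standing in for the MSO's stored data are hypothetical, since the source does not describe the storage layout.

```python
def resolve_group(
    group_id: str,
    groups: dict[str, list[str]],
    terminal_base: dict[str, str],
    terminal_voice_type: dict[str, str],
) -> tuple[list[str], dict[str, str]]:
    """Return the participating base stations and per-member voice types."""
    members = groups[group_id]  # group identifier -> member terminal identifiers
    # A base station participates if at least one group member is registered there.
    participating = sorted({terminal_base[t] for t in members})
    voice_types = {t: terminal_voice_type[t] for t in members}
    return participating, voice_types
```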

Step S303: and judging whether the voice type identifications of all the group member terminals are completely consistent, and if not, entering the step S304.

Step S304: and under the condition that the voice type identifications of all the group member terminals are not completely consistent, determining first state information which represents different voice type identifications and needs to execute translation operation.

After determining the identifier of each group member terminal, the MSO may determine the voice type identifier corresponding to each group member terminal. If some group member terminal identifiers correspond to a first voice type identifier (such as a Chinese type identifier), and some group member terminal identifiers correspond to a second voice type identifier (such as an English type identifier), it indicates that the voice type identifiers of the group member terminals are not completely consistent.
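The consistency check of steps S303 and S304 can be sketched as follows. The constant names and string values are hypothetical placeholders for the first and second state information, whose actual encoding the source does not specify.

```python
from collections.abc import Iterable

# Hypothetical encodings of the two kinds of state information.
FIRST_STATE = "translation required"
SECOND_STATE = "no translation required"


def determine_state(member_voice_types: Iterable[str]) -> str:
    """Return the state information for a group call."""
    # More than one distinct identifier means the identifiers are not
    # completely consistent, so a translation operation is needed.
    if len(set(member_voice_types)) > 1:
        return FIRST_STATE
    return SECOND_STATE
```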

Step S305: and the MSO sends the first state information to each participating base station.

The MSO also sends first status information to each participating base station for each participating base station to determine the operating slot mode.

Step S306: each participating base station determines a working mode according to the first state information and the voice type identifiers of its registered terminals; if the voice type identifiers of the registered terminals of the participating base station are not completely consistent, the double-time-slot working mode is determined, and if they are completely consistent, the single-time-slot working mode is determined.

The participating base station has registered terminals in its coverage area and retrieves their voice type identifiers from its storage space. If the voice type identifiers of the registered terminals are not completely consistent (for example, Base1 in fig. 1a), the participating base station determines to work in the double-time-slot working mode, so that one time slot is subsequently used for receiving the uplink voice data packet and the other time slot is used for receiving the translated voice data packet; the two types of voice data packets are thus received in parallel, which improves the receiving efficiency and reduces the time delay.

If the voice type identifiers of the registered terminals are completely consistent (for example, Base2 and Base3 in fig. 1a), the participating base station determines to work in the single-time-slot working mode, so as to subsequently receive either the uplink voice data packet or the translated voice data packet using a single time slot.
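The working-mode decision of step S306 can be sketched as follows. This is a minimal illustration: `SlotMode` and `determine_slot_mode` are hypothetical names, and the boolean `translation_needed` stands in for the first state information.

```python
from collections.abc import Iterable
from enum import Enum


class SlotMode(Enum):
    SINGLE = "single-time-slot"
    DUAL = "double-time-slot"


def determine_slot_mode(translation_needed: bool,
                        terminal_voice_types: Iterable[str]) -> SlotMode:
    """Choose the working mode of one participating base station."""
    mixed = len(set(terminal_voice_types)) > 1
    # Two time slots are only needed when translation is required for the
    # group call and this base station serves terminals of both voice types.
    if translation_needed and mixed:
        return SlotMode.DUAL
    return SlotMode.SINGLE
```

With the example of fig. 1a, a base station like Base1 (Chinese and English terminals) would get the double-time-slot mode, while Base2 and Base3 would get the single-time-slot mode.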

The calling terminal may transmit voice data during a long press on the "PTT allowed" press zone after the group call is established. As the speaking time becomes longer, the voice data will also become larger gradually, and in order to improve the real-time performance and the sending efficiency, the calling terminal generates an uplink voice data packet at intervals of a preset time (for example, 60ms), and the uplink voice data packet includes a voice type identifier of the calling terminal, which is called as a first voice type identifier for convenience of differentiation.

Step S201 then proceeds to step S202: and the calling terminal sends an uplink voice data packet containing the first voice type identifier.

The calling terminal constructs an uplink voice data packet containing the voice fragment and the first voice type identifier, and sends the uplink voice data packet to the calling base station, and the calling base station sends the uplink voice data packet to the MSO.
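The construction of uplink voice data packets at a fixed interval can be sketched as follows. Only the 60 ms interval comes from the text above; the raw PCM input, the 8000 Hz sample rate, the 2-byte sample width, and the names `UplinkVoicePacket` and `packetize` are assumptions made for illustration (a real narrowband terminal would typically carry vocoder frames instead).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UplinkVoicePacket:
    voice_type: str  # first voice type identifier of the calling terminal
    payload: bytes   # one voice fragment


def packetize(pcm: bytes, voice_type: str, sample_rate_hz: int = 8000,
              bytes_per_sample: int = 2,
              interval_ms: int = 60) -> list[UplinkVoicePacket]:
    """Split captured audio into one packet per preset interval."""
    # Number of bytes of audio captured during one interval.
    chunk = sample_rate_hz * bytes_per_sample * interval_ms // 1000
    return [UplinkVoicePacket(voice_type, pcm[i:i + chunk])
            for i in range(0, len(pcm), chunk)]
```

Under these assumed parameters each full packet carries 960 bytes of audio, and the last packet of a talk burst may be shorter.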

The flow proceeds to step S203: and the MSO receives an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sends the uplink voice data packet to each participating base station with the first voice type identifier so that each participating base station can send the uplink voice data packet to a registered terminal according to a working time slot mode.

During group call establishment, the MSO has already learned the group identifier in the group call request and the voice type identifiers present under each participating base station of that group: for example, Base1 has terminals with different voice type identifiers, the terminals under Base2 all have the English type identifier, and the terminals under Base3 all have the Chinese type identifier.

Therefore, after receiving the uplink voice data packet containing the first voice type identifier, the MSO determines, from the participating base stations, each participating base station that has the first voice type identifier (i.e. participating base stations whose terminals all have the first voice type identifier, and participating base stations whose terminals have both the first voice type identifier and the second voice type identifier).

And the MSO sends the uplink voice data packet to each participating base station containing the first voice type identifier so that each participating base station can send the uplink voice data packet to the registered terminal according to the working time slot mode.

If all terminals under a participating base station have the first voice type identifier, the participating base station works in the single-time-slot working mode and sends the uplink voice data packet directly to its registered terminals through the single time slot.

If a participating base station has both terminals with the first voice type identifier and terminals with the second voice type identifier, it works in the double-time-slot working mode and sends the uplink voice data packet to its registered terminals through the first time slot.
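The selection of target base stations and time slots for the uplink voice data packet can be sketched as follows. This is a minimal illustration only: the dictionary that maps each base station to the set of identifiers present under it, the function name, and the slot labels are hypothetical.

```python
from typing import Dict, Set


def plan_uplink_delivery(first_id: str,
                         base_station_ids: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map each base station that should receive the uplink packet to the slot it uses.

    base_station_ids holds the voice type identifiers present under each
    participating base station, as learned by the MSO during group call set-up
    (hypothetical layout).
    """
    plan: Dict[str, str] = {}
    for base, ids in base_station_ids.items():
        if first_id not in ids:
            continue  # no terminal under this base station uses the caller's language
        if len(ids) == 1:
            plan[base] = "single slot"  # single-time-slot working mode
        else:
            plan[base] = "first slot"   # double-time-slot working mode
    return plan


# Example from the text: Base1 is mixed, Base2 is English only, Base3 is Chinese only.
example = {"Base1": {"zh", "en"}, "Base2": {"en"}, "Base3": {"zh"}}
uplink_plan = plan_uplink_delivery("zh", example)
```

With a Chinese-speaking caller, Base1 uses its first slot, Base3 uses its single slot, and Base2 is skipped because none of its terminals has the first voice type identifier.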

Step S204: the MSO acquires a translated voice data packet corresponding to the uplink voice data packet, and sends the translated voice data packet containing the second voice type identifier to each participating base station having the second voice type identifier, so that each participating base station issues the translated voice data packet to its registered terminals according to its working time slot mode.

Steps S203 and S204 are executed in parallel; there is no fixed order between them.
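One possible way to run the two branches side by side is shown below, assuming a thread pool inside the MSO software. The two callables are stand-ins for the S203 and S204 branches and are not defined by the application; the sketch only shows that neither branch waits for the other to start.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple


def process_uplink_packet(packet: bytes,
                          forward_original: Callable[[bytes], str],
                          translate_and_forward: Callable[[bytes], str]) -> Tuple[str, str]:
    """Run the S203 branch and the S204 branch concurrently and collect both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        original = pool.submit(forward_original, packet)         # step S203
        translated = pool.submit(translate_and_forward, packet)  # step S204
        return original.result(), translated.result()


# Stand-in branch functions, for illustration only.
result = process_uplink_packet(
    b"voice",
    lambda p: "uplink forwarded",
    lambda p: "translation forwarded",
)
```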

In step S201 the MSO has already determined the first state information, that is, that the MSO needs to perform a translation operation.

In the speech processing system shown in fig. 1a, the MSO may have a built-in translation module and use it to translate the uplink voice data packet, obtaining a translated voice data packet. The MSO then adds the second voice type identifier to the translated voice data packet.

In the speech processing system shown in fig. 1b, the MSO may send the uplink voice data packet to the translation server and receive the translated voice data packet returned by the translation server. The MSO then adds the second voice type identifier to the translated voice data packet.
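Both variants can be viewed as the same operation with a different translation back end. The sketch below passes the back end in as a callable, so a built-in module (fig. 1a) or a client for the translation server (fig. 1b) could be plugged in. The `Translator` signature, the `VoicePacket` layout, and `fake_translator` are assumptions for illustration; no real translation API is implied.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VoicePacket:
    voice_type_id: str
    payload: bytes


# (audio, source identifier, target identifier) -> translated audio
Translator = Callable[[bytes, str, str], bytes]


def build_translated_packet(uplink: VoicePacket, second_id: str,
                            translate: Translator) -> VoicePacket:
    """Translate the uplink payload and tag the result with the second identifier."""
    translated_audio = translate(uplink.payload, uplink.voice_type_id, second_id)
    return VoicePacket(voice_type_id=second_id, payload=translated_audio)


def fake_translator(audio: bytes, source_id: str, target_id: str) -> bytes:
    """Placeholder that only marks the audio; a real module would return translated speech."""
    return f"{source_id}->{target_id}:".encode() + audio


translated = build_translated_packet(VoicePacket("zh", b"frame"), "en", fake_translator)
```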

The MSO determines each participating base station that has the second voice type identifier: that is, a participating base station whose terminals all have the second voice type identifier, or one whose terminals include both the first and the second voice type identifiers.

If all terminals under a participating base station have the second voice type identifier, the participating base station works in the single-time-slot working mode and sends the translated voice data packet directly to its registered terminals through the single time slot.

If a participating base station has both terminals with the first voice type identifier and terminals with the second voice type identifier, it works in the double-time-slot working mode and sends the translated voice data packet to its registered terminals through the second time slot.

Sending the uplink voice data packet and the translated voice data packet in separate time slots of the double-time-slot mode allows the two languages to be delivered in parallel. This avoids the single-time-slot limitation, where the translated voice data packet could be sent only after the uplink voice data packet had been sent, and so shortens the delay.
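From the base station's side, the slot choice described above might look like the following sketch. The `BaseStation` class, its constructor arguments, and the numeric slot indices are hypothetical and serve only to make the single-slot and double-slot cases concrete.

```python
from typing import List


class BaseStation:
    """Hypothetical model of a participating base station's downlink scheduling."""

    def __init__(self, registered_ids: List[str], first_id: str, second_id: str):
        self.first_id = first_id
        self.second_id = second_id
        # Mixed languages under this base station -> double-time-slot working mode.
        self.dual_slot = len(set(registered_ids)) > 1

    def slot_for(self, packet_id: str) -> int:
        """Return the downlink time slot (1 or 2) used for a packet."""
        if not self.dual_slot:
            return 1  # the single time slot carries everything
        if packet_id == self.first_id:
            return 1  # uplink (original) voice
        if packet_id == self.second_id:
            return 2  # translated voice
        raise ValueError(f"unexpected voice type identifier: {packet_id!r}")


base1 = BaseStation(["zh", "en", "zh"], first_id="zh", second_id="en")
base2 = BaseStation(["en", "en"], first_id="zh", second_id="en")
slots = (base1.slot_for("zh"), base1.slot_for("en"), base2.slot_for("en"))
```

Here the mixed-language station uses slot 1 for the original voice and slot 2 for the translation, while the English-only station sends the translation in its single slot.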

Step S205: the registered terminals under each participating base station receive the voice data packet and compare the voice type identifier in the voice data packet with the local voice type identifier; if the two are consistent, the voice data packet is received and played, and if they are inconsistent, it is discarded.

After a registered terminal under a participating base station receives a voice data packet (an uplink voice data packet and/or a translated voice data packet), it compares the voice type identifier in the packet with its local voice type identifier.

If the identifier is consistent with the local voice type identifier, the packet carries voice the user can understand, and it is received and played directly. If the identifier is inconsistent with the local voice type identifier, the packet carries voice the user cannot understand, and it is discarded.

Taking Base1, Base2, and Base3 in fig. 1 as examples: a registered terminal under Base1 receives both the uplink voice data packet and the translated voice data packet, but it receives and plays only the voice data packet whose voice type identifier matches its own, so that the user hears intelligible voice; voice data packets whose identifier does not match are discarded, so that the user does not hear unintelligible voice.

A registered terminal under Base2 receives only the translated voice data packet and plays it directly; a registered terminal under Base3 receives only the uplink voice data packet and plays it directly.
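The terminal-side comparison can be sketched as a small filter. The tuple packet representation and the `play` callback are assumptions for the example; the point is only that a mismatching identifier leads to a discard rather than playback.

```python
from typing import Callable, List, Tuple


def on_voice_packet(local_id: str, packet: Tuple[str, bytes],
                    play: Callable[[bytes], None]) -> bool:
    """Play the packet if its identifier matches the terminal's own; otherwise discard it."""
    packet_id, payload = packet
    if packet_id == local_id:
        play(payload)
        return True
    return False  # discarded


# An English terminal under a mixed-language base station such as Base1.
played: List[bytes] = []
incoming = [("zh", b"original"), ("en", b"translated")]
results = [on_voice_packet("en", p, played.append) for p in incoming]
```

In this run the terminal discards the original Chinese packet and plays only the translated one.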

Step S206: after receiving the end frame fed back by each base station, the MSO sends a speaking right releasing instruction to each participating base station, so that each participating base station sends the speaking right releasing instruction to the registered terminal.

While the calling terminal is executing the group call, it continuously sends uplink voice data packets containing the first voice type identifier until the speaking right is released; after receiving the speaking right release information, the calling base station generates an end frame and sends it to the MSO.

The MSO repeatedly executes step S203 until all uplink voice data packets are completely transmitted, and then transmits an end frame to each participating base station having the first voice type identifier, so that each participating base station transmits the end frame to the registered terminal, and the participating base station feeds back the received end frame to the MSO after receiving the end frame.

The MSO repeatedly executes the step S204 until all the translation voice data packets are completely issued, and then issues an end frame to each participating base station with the second voice type identification so that each participating base station can issue the end frame to the registered terminal; and after receiving the end frame, the participating base station feeds back the received end frame to the MSO.

Once the MSO has received, from all participating base stations, the feedback that the end frame was received, all uplink voice data packets and all translated voice data packets have been sent; that is, one group call by the calling terminal has been completed.

During the calling terminal's talk period, the MSO disables the group call authority of the other terminals. Optionally, to improve the user experience, a prompt indicating that someone is speaking may be displayed during this period. Preferably, the Chinese text for "speaking" is displayed on Chinese terminals and the English text "talking" on English terminals.

Therefore, after the group call of the calling terminal is finished, the MSO sends a speaking right release instruction to each participating base station, so that each participating base station sends the speaking right release instruction to its registered terminals.
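The condition for releasing the speaking right can be illustrated with a small bookkeeping class at the MSO. `EndFrameTracker`, the stream labels "uplink" and "translated", and the method names are hypothetical; the sketch assumes the MSO waits for an end-frame acknowledgement per base station and per stream, which is one reading of the text.

```python
from typing import Iterable, Set, Tuple


class EndFrameTracker:
    """Tracks end-frame acknowledgements at the MSO (hypothetical bookkeeping)."""

    def __init__(self, uplink_bases: Iterable[str], translated_bases: Iterable[str]):
        self.pending: Set[Tuple[str, str]] = (
            {(b, "uplink") for b in uplink_bases}
            | {(b, "translated") for b in translated_bases}
        )
        self.released = False

    def on_ack(self, base: str, stream: str) -> bool:
        """Record one acknowledgement; return True when the talk right should be released."""
        self.pending.discard((base, stream))
        if not self.pending and not self.released:
            self.released = True
            return True
        return False


tracker = EndFrameTracker(uplink_bases=["Base1", "Base3"],
                          translated_bases=["Base1", "Base2"])
events = [tracker.on_ack("Base1", "uplink"), tracker.on_ack("Base3", "uplink"),
          tracker.on_ack("Base1", "translated"), tracker.on_ack("Base2", "translated")]
```

Only the last acknowledgement returns True, at which point the MSO would send the speaking right release instruction to every participating base station.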

Step S207: after the registered terminal receives the speaking right release instruction, it displays that the group call operation area is in an available state.

A "PTT allowed" press area may be displayed on the display screen of the registered terminal, and the color of the press area indicates that the group call operation area is in an available state.

Optionally, because translating the voice data takes some time, the translated voice data packets have not all been issued at the moment when all uplink voice data packets have been issued. There is therefore a time gap between the completion of the uplink voice data packets and the completion of the translated voice data packets.

Terminals that have already received all uplink voice data packets are idle during this gap, but the speaking right cannot be returned to them yet. To improve the user experience, a registered terminal that has received the end frame but not yet the speaking right release instruction displays a prompt indicating that speech translation is in progress. Preferably, the Chinese text for "in speech translation" is displayed on Chinese terminals, while English terminals continue to display "talking".
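The prompt shown on a registered terminal depends on which of the two events it has received so far. A minimal sketch follows, assuming the English glosses used in the text stand in for the actual display strings and that two booleans track the received events; the function and table names are hypothetical.

```python
from typing import Dict

# Prompt shown while another terminal holds the speaking right.
# The glosses follow the text; the table itself is an assumption.
BUSY_LABEL: Dict[str, str] = {"zh": "speaking", "en": "talking"}


def prompt(local_id: str, end_frame_received: bool, release_received: bool) -> str:
    """Return the prompt a registered terminal would display."""
    if release_received:
        return "PTT allowed"            # group call operation area is available again
    if end_frame_received:
        return "in speech translation"  # own stream finished, translation still running
    return BUSY_LABEL.get(local_id, "talking")
```

A Chinese terminal whose uplink stream has ended moves from "speaking" to "in speech translation", while an English terminal that is still receiving translated packets keeps showing "talking" until the release instruction arrives.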

Through the technical means, the following beneficial effects can be realized:

In a parallel operation, the uplink voice data packet is translated to obtain a translated voice data packet, which is sent to each participating base station that requires translation, and that base station issues it to its registered terminals. A registered terminal under the participating base station receives and plays the translated voice data packet only if its local voice type identifier is consistent with the second voice type identifier. This operation delivers the translated voice data packet to the registered terminals.

The scheme can be suitable for real-time voice communication in a group calling scene.

Referring to fig. 4, the present invention further provides a speech processing method, which may include the following steps:

step S401: the MSO receives the group call request initiated by the calling terminal and establishes the group call.

Referring to fig. 5, this step can be implemented by the following steps:

step S501: a calling terminal initiates a group calling request to an MSO through a calling base station; wherein the group call request includes a group identification.

Step S502: the MSO receives a group call request initiated by a calling terminal, determines each group member terminal identification corresponding to the group identification, each participating base station corresponding to each group member terminal identification, and determines a voice type identification corresponding to each group member terminal identification.

Step S503: judging whether the voice type identifiers of all the group member terminals are completely consistent, and if so, proceeding to step S504.

Step S504: when the voice type identifiers of all the group member terminals are completely consistent, the MSO determines second state information, which indicates that the voice type identifiers are the same and that no translation operation needs to be executed.

Step S505: the MSO sends the second state information to each participating base station.

Step S506: each participating base station determines the single-time-slot working mode according to the second state information.

Because the group member terminals are all in the same language, the participating base stations do not need to adopt a double-time-slot working mode, and only need to adopt a single-time-slot working mode.
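The decision made during group call set-up can be sketched as two small functions, one for the MSO and one for a participating base station. The constant values, function names, and the terminal-to-identifier dictionary are hypothetical; the branch for the first state information follows the description of the embodiment of fig. 2.

```python
from typing import Dict, Iterable

FIRST_STATE = "translation required"      # identifiers differ within the group
SECOND_STATE = "no translation required"  # all identifiers identical


def determine_state(member_ids: Dict[str, str]) -> str:
    """MSO side: choose the state information from the group members' identifiers."""
    distinct = set(member_ids.values())
    return FIRST_STATE if len(distinct) > 1 else SECOND_STATE


def working_mode(state: str, registered_ids: Iterable[str]) -> str:
    """Base station side: choose the working time slot mode."""
    if state == SECOND_STATE:
        return "single"
    return "dual" if len(set(registered_ids)) > 1 else "single"


same_language_group = {"T1": "zh", "T2": "zh", "T3": "zh"}
mixed_group = {"T1": "zh", "T2": "en"}
```

For `same_language_group` the MSO would determine the second state information and every participating base station would stay in the single-time-slot working mode.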

Step S402: the calling terminal sends an uplink voice data packet containing the first voice type identifier.

Step S403: the MSO receives the uplink voice data packet sent by the calling terminal and containing the first voice type identifier, and sends the uplink voice data packet to each participating base station having the first voice type identifier, so that each participating base station issues the uplink voice data packet to its registered terminals according to its working time slot mode.

Step S404: the registered terminals under each participating base station receive the uplink voice data packet and compare the voice type identifier in the uplink voice data packet with the local voice type identifier; if the two are consistent, the uplink voice data packet is received and played, and if they are inconsistent, it is discarded.

Step S405: after receiving the end frame fed back by each base station, the MSO sends a speaking right releasing instruction to each participating base station, so that each participating base station sends the speaking right releasing instruction to the registered terminal.

The MSO repeatedly executes step S403 until all uplink voice data packets are completely transmitted, and then transmits an end frame to each participating base station with the first voice type identifier, so that each participating base station transmits the end frame to the registered terminal, and the participating base station feeds back the received end frame to the MSO after receiving the end frame.

Once the MSO has received, from all participating base stations, the feedback that the end frame was received, all uplink voice data packets have been sent; that is, one group call by the calling terminal has been completed.

Step S406: after the registered terminal receives the speaking right release instruction, it displays that the group call operation area is in an available state.

Referring to fig. 6, the present application provides a mobile switching office comprising:

a memory for storing a software program;

a processor for executing the software program in the memory to implement the voice processing method.

Referring to fig. 7, the present application provides a terminal including:

a memory for storing a software program;

The voice processing method in fig. 6 and fig. 7 has already been clearly explained in the embodiment of fig. 2 and fig. 4, and is not repeated herein.

Referring to fig. 1a or 1b, the present application provides a speech processing system comprising:

the mobile switching center and a plurality of base stations connected with the mobile switching center, wherein each base station has a plurality of registered terminals in the coverage area;

the MSO is used for receiving a group call request initiated by a calling terminal and establishing a group call, receiving an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sending the uplink voice data packet to each participating base station with the first voice type identifier; acquiring a translation voice data packet corresponding to the uplink voice data packet, and sending the translation voice data packet containing the second voice type identifier to each participating base station with the second voice type identifier;

each base station is used for sending an uplink voice data packet to the registered terminal according to the working time slot mode, or sending a translation voice data packet to the registered terminal according to the working time slot mode;

The detailed description of the speech processing system has already been clearly explained in the above embodiments of fig. 2 and 4, and is not repeated here.

Referring to fig. 8, the present application provides a speech processing apparatus applied to a mobile switching office, the apparatus comprising:

a first receiving unit 81, configured to receive a group call request initiated by a calling terminal and establish a group call;

a second receiving unit 82, configured to receive an uplink voice data packet that is sent by the calling terminal and includes a first voice type identifier, and send the uplink voice data packet to each participating base station having the first voice type identifier, so that each participating base station issues the uplink voice data packet to a registered terminal according to a working timeslot mode.

An obtaining unit 83, configured to obtain a translated voice data packet corresponding to the uplink voice data packet, and send the translated voice data packet including the second voice type identifier to each participating base station having the second voice type identifier, so that each participating base station issues the translated voice data packet to a registered terminal according to a working timeslot mode.

The registered terminals under each participating base station receive the voice data packet, compare the voice type identification in the voice data packet with the local voice type identification, if the voice type identification is consistent with the local voice type identification, the voice data packet is received and played, and if the voice data packet is inconsistent with the local voice type identification, the voice data packet is discarded.

Referring to fig. 9, the present application provides a voice processing apparatus applied to a registered terminal under a participating base station, the apparatus including:

a third receiving unit 91, configured to receive a voice data packet sent by a participating base station according to a working timeslot mode; the voice data packet is an uplink voice data packet or a translation voice data packet corresponding to the uplink voice data packet.

The comparing unit 92 is configured to compare the voice type identifier in the voice data packet with the local voice type identifier.

The processing unit 93 is configured to receive and play the voice data packet if the voice type identifier in the voice data packet is consistent with the local voice type identifier, and to discard the voice data packet if it is inconsistent.

The speech processing apparatus in fig. 8 and 9 has already been clearly explained in the above embodiment of fig. 2 and 4, and is not described again here.

The functions described in the method of the present embodiment, if implemented in the form of software functional units and sold or used as independent products, may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments of the present application that contributes over the prior art, or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method of speech processing, comprising:

the mobile switching center receives an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sends the uplink voice data packet to each participating base station with the first voice type identifier so that each participating base station can send the uplink voice data packet to a registered terminal according to a working mode;

the mobile switching office acquires a translation voice data packet corresponding to the uplink voice data packet, and sends the translation voice data packet containing a second voice type identifier to each participating base station with the second voice type identifier, so that each participating base station can send the translation voice data packet to a registered terminal according to a working mode;

the registered terminals under each participating base station receive the voice data packet, compare the voice type identification in the voice data packet with the local voice type identification, if the voice data packet is consistent with the local voice type identification, the voice data packet is received and played, and if the voice data packet is inconsistent with the local voice type identification, the voice data packet is discarded;

wherein, the mobile exchange office receives the group call request initiated by the calling terminal and establishes the group call, including:

the mobile switching office receives a group call request initiated by the calling terminal, and determines each participating base station of a group member terminal corresponding to a group identifier in the group call request and a voice type identifier of each group member terminal;

under the condition that the voice type identifications of all group member terminals are not completely consistent, first state information which represents different voice type identifications and needs to execute translation operation is determined;

sending the first state information to each participating base station;

each participating base station determines the working mode according to the first state information and the voice type identifications of its registered terminals; wherein a participating base station is determined to be in a double-time-slot working mode if the voice type identifications of its registered terminals are not completely consistent, and is determined to be in a single-time-slot working mode if the voice type identifications of its registered terminals are completely consistent.

2. The method of claim 1,

and when the participating base station is in the double-time-slot working mode, the participating base station issues the uplink voice data packet to the registered terminal through a first time slot, and issues the translation voice data packet to the registered terminal through a second time slot.

3. The method of claim 1, wherein said mobile switching office is connected to a translation server, and said obtaining a translated voice packet corresponding to said upstream voice packet by said mobile switching office comprises:

sending the uplink voice data packet to the translation server;

and adding a second voice type identifier in the translation voice data packet.

4. The method of claim 1,

the mobile exchange office completes the sending of all uplink voice data packets, and sends an end frame to each participating base station with the first voice type identifier, so that each participating base station sends the end frame to a registered terminal; after receiving the end frame, the participating base station feeds back the received end frame to the mobile switching office;

and after receiving the feedback of all the participating base stations and receiving the ending frame, the mobile switching center sends a speaking right releasing instruction to each participating base station, so that each participating base station sends the speaking right releasing instruction to the registered terminal.

5. The method of claim 4, further comprising:

displaying prompt information under the condition that the registered terminal receives the ending frame and does not receive the call right releasing instruction;

6. The method of claim 1, wherein after determining the respective participating base stations corresponding to the group member terminals corresponding to the group identifier in the group call request, and the voice type identifiers of the respective group member terminals, further comprising:

under the condition that the voice type identifications of all group member terminals are determined to be completely consistent, second state information which represents the same voice type identification and does not need to execute translation operation is determined, and the second state information is sent to all the participating base stations;

and each participating base station is determined to be in the single-time-slot working mode according to the second state information.

7. The method of claim 1, further comprising:

initializing the terminal by using the configuration terminal so as to add a voice type identifier into the terminal;

each participating base station stores the voice type identification of the registered terminal in the coverage area of the participating base station and sends the voice type identification to a mobile switching office, so that the mobile switching office stores the voice type identification of the registered terminal in the coverage area of each participating base station;

8. A speech processing method, applied to a mobile switching office, the method comprising:

receiving an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sending the uplink voice data packet to each participating base station with the first voice type identifier so that each participating base station can send the uplink voice data packet to a registered terminal according to a working mode;

acquiring a translation voice data packet corresponding to the uplink voice data packet, and sending the translation voice data packet containing a second voice type identifier to each participating base station with the second voice type identifier so that each participating base station can send the translation voice data packet to a registered terminal according to a working mode;

wherein, the receiving a group call request initiated by a calling terminal and establishing a group call comprises:

sending the first state information to each participating base station;

each participating base station determines the working mode according to the first state information and the voice type identifications of its registered terminals; wherein a participating base station is determined to be in a double-time-slot working mode if the voice type identifications of its registered terminals are not completely consistent, and is determined to be in a single-time-slot working mode if the voice type identifications of its registered terminals are completely consistent.

9. A mobile switching office, comprising:

a memory for storing a software program;

a processor for executing the software program in the memory and implementing the speech processing method as claimed in claim 8.

10. A speech processing method applied to a registered terminal under a participating base station, the method comprising:

or the like, or, alternatively,

receiving a voice data packet issued by a participating base station according to a working mode; the voice data packet is an uplink voice data packet, or a translation voice data packet corresponding to the uplink voice data packet; the working modes comprise a single-time-slot working mode and a double-time-slot working mode; if the voice type identifications of the registered terminals of the participating base station are completely consistent, the participating base station is determined to be in the single-time-slot working mode; if the voice type identifications of the registered terminals of the participating base station are not completely consistent, the participating base station is determined to be in the double-time-slot working mode;

11. A terminal, comprising:

a memory for storing a software program;

a processor for executing the software program in the memory and implementing the speech processing method as claimed in claim 10.

12. A speech processing system, comprising:

the mobile switching office is used for receiving a group call request initiated by a calling terminal and establishing a group call, receiving an uplink voice data packet which is sent by the calling terminal and contains a first voice type identifier, and sending the uplink voice data packet to each participating base station with the first voice type identifier; acquiring a translation voice data packet corresponding to the uplink voice data packet, and sending the translation voice data packet containing a second voice type identifier to each participating base station with the second voice type identifier;

each participating base station is used for sending an uplink voice data packet to the registered terminal according to the working mode, or sending a translation voice data packet to the registered terminal according to the working mode;

the registered terminal is used for comparing the voice type identifier in the voice data packet with the local voice type identifier, receiving and playing the voice data packet if the two are consistent, and discarding the voice data packet if they are inconsistent;

the mobile switching center is used for receiving a group call request initiated by a calling terminal and establishing a group call, and comprises the following steps:

the mobile switching office is used for receiving a group call request initiated by the calling terminal, and determining each participating base station of a group member terminal corresponding to a group identifier in the group call request and a voice type identifier of each group member terminal; under the condition that the voice type identifications of all group member terminals are not completely consistent, determining first state information which represents different voice type identifications and needs to execute translation operation; sending the first state information to each participating base station;

each participating base station is used for determining the working mode according to the first state information and the voice type identifications of its registered terminals; wherein a participating base station is determined to be in a double-time-slot working mode if the voice type identifications of its registered terminals are not completely consistent, and is determined to be in a single-time-slot working mode if the voice type identifications of its registered terminals are completely consistent.

13. The system of claim 12,

14. The system of claim 12, wherein said mobile switching office is coupled to a translation server, and said mobile switching office obtaining a translated voice data packet corresponding to said upstream voice data packet comprises:

15. The system of claim 12,

the mobile switching office is used for sending an end frame to each participating base station with the first voice type identifier after all the uplink voice data packets are sent;

and the mobile switching office is used for sending, after receiving from all the participating base stations the feedback that the end frame has been received, a speaking right releasing instruction to each participating base station, so that each participating base station can send the speaking right releasing instruction to the registered terminal.

16. The system of claim 15, further comprising:

17. The system of claim 12, wherein after determining the respective participating base stations corresponding to the group member terminals corresponding to the group identifier in the group call request, and the voice type identifiers of the respective group member terminals, further comprising:

and each participating base station is used for determining the single-time-slot working mode according to the second state information.

18. The system of claim 12, further comprising a configuration terminal;

each participating base station is used for storing the voice type identification of the registered terminal in the coverage area and sending the voice type identification to the mobile switching office, so that the mobile switching office can store the voice type identification of the registered terminal in the coverage area of each participating base station;