CN1747465A

CN1747465A - Realization of speech service

Info

Publication number: CN1747465A
Application number: CNA2004100737457A
Authority: CN
Inventors: 路讴
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2004-09-09
Filing date: 2004-09-09
Publication date: 2006-03-15
Anticipated expiration: 2024-09-09
Also published as: CN100525297C

Abstract

The invention discloses a method for implementing voice services. A VDP layer is added to the RAN to convert between voice frame data and voice packet data. After the sending RAN receives encoded voice frame data from the sending MS, its VDP layer converts the voice frame data into voice packet data and sends it to the packet network. After the receiving RAN receives the voice packet data from the packet network, its VDP layer converts the voice packet data into voice frame data and sends the voice frame data to the receiving MS, which decodes and plays it. Because the RAN performs the conversion between voice frame data and voice packet data, the system needs neither a dedicated packet gateway device for this conversion nor a conversion function on the terminal, which reduces system cost and simplifies the system structure.

Description

A method for realizing voice service

Technical field

The invention relates to voice transmission technology, and in particular to a method for implementing voice services in a packet network.

Background art

Traditional circuit-switched voice transmission requires 64 kbps of bandwidth. With the development of 2.5G and 3G wireless networks, packet data services have been introduced. Packet voice technology usually uses codecs with rates below 8 kbps, and because of the silent periods in a user's conversation, the actual average bandwidth a packet network needs to carry packet voice is often as low as 2 to 3 kbps. Packet voice technology therefore saves bandwidth effectively, reducing the networking and operating costs of the packet network. Moreover, because packet networks allow flexible networking, they have attracted growing attention and are widely used in wireless networks.

A system that implements packet voice services needs a unit that converts between voice frame data and voice packet data. In addition, packet networks exhibit packet jitter. The magnitude of the jitter is reflected in the differences between the end-to-end delays of different packets within a given period: the larger the difference in end-to-end delay between packets, the more severe the jitter of the packet network. When voice data is carried over a packet network, jitter prevents voice data from being played on time and causes gaps in the sound, which degrades the voice quality of the call. Voice packet data therefore requires jitter processing.
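To make this definition of jitter concrete, the short sketch below computes the end-to-end delay of a few packets and the largest difference between those delays. The timestamps are made-up illustration values and the function names are hypothetical; nothing here is taken from the patent beyond the definition itself.

```python
def end_to_end_delays(send_times_ms, recv_times_ms):
    """Return the end-to-end delay of each packet, in milliseconds."""
    return [recv - send for send, recv in zip(send_times_ms, recv_times_ms)]


def delay_spread(delays_ms):
    """Largest difference in end-to-end delay between any two packets in the window."""
    return max(delays_ms) - min(delays_ms)


# Hypothetical packets generated every 20 ms; the third one is badly delayed.
send = [0, 20, 40, 60]
recv = [50, 75, 130, 115]

delays = end_to_end_delays(send, recv)  # [50, 55, 90, 55]
spread = delay_spread(delays)           # 40
```

In this made-up trace the third packet arrives after the fourth, so a receiver that played packets immediately on arrival would have nothing to play at the moment the third packet was due.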

A CDMA system can implement packet voice services in several ways. The system structure of one of the more traditional implementations is shown in Figure 1. In this networking mode, packet voice service works as follows: voice information is encoded on the sending mobile station (MS) and sent as voice frame data from the sending MS, via the radio access network (RAN) and the mobile switching center (MSC), to the sending packet voice gateway. The sending packet voice gateway converts the voice frame data into voice packet data, which the packet network carries to the receiving side. The receiving packet voice gateway performs jitter processing on the voice packet data from the packet network, converts it into voice frame data, and sends it through the MSC and RAN to the receiving MS, which decodes the voice frame data into a voice signal.

The drawback of this networking mode is that it requires dedicated packet voice gateway devices to convert between voice frame data and voice packet data and to perform jitter processing, which adds cost.

To overcome this drawback, another way of implementing packet voice services in a CDMA system emerged. Its system structure is shown in Figure 2. In this networking mode, packet voice service works as follows: the sending MS packs the voice data directly into voice packet data and sends it through the sending RAN to the packet network; the packet network carries the voice packet data to the receiving RAN; the receiving RAN sends the voice packet data to the receiving MS; and the receiving MS performs jitter processing on the voice packet data and decodes it directly into a voice signal.

Compared with the first networking mode, the second needs no dedicated packet voice gateway. However, it has drawbacks of its own:

First, the handset must convert between voice packet data and voice frame data, and most early handsets do not support this function.

Second, the handset must perform the jitter processing, which complicates handset processing, demands higher processing performance, and raises handset cost.

Summary of the invention

The main object of the present invention is to provide a method for implementing voice services in which the RAN performs the conversion between voice frame data and voice packet data.

The object of the present invention is achieved through the following technical solution:

A method for implementing a voice service, comprising the following steps:

adding, to the protocol layers of the radio access network (RAN), a voice data protocol (VDP) layer for converting between voice packet data and voice frame data;

after the sending RAN receives encoded voice frame data from the sending mobile station (MS), its VDP layer converts the voice frame data into voice packet data and sends the voice packet data to the packet network;

after the receiving RAN receives the voice packet data from the packet network, its VDP layer converts the voice packet data into voice frame data and sends the voice frame data to the receiving MS, which decodes and plays the voice frame data.

The method further comprises: allocating in the RAN, for each connected link, a jitter buffer of a certain capacity for temporarily storing voice packet data.

After the receiving RAN receives voice packet data and before it converts that data into voice frame data, the method further comprises: the receiving RAN stores the received voice packet data in the jitter buffer and, when the retrieval condition is met, takes voice packet data out of the jitter buffer and passes it to the VDP layer.

The receiving RAN stores received voice packet data in the jitter buffer as follows: the receiving RAN checks whether the jitter buffer currently has free space; if so, it stores the voice packet data in the jitter buffer; otherwise, it discards the oldest voice packet data in the jitter buffer and then stores the received voice packet data.

The receiving RAN takes voice packet data out of the jitter buffer and passes it to the VDP layer as follows:

A. The receiving RAN checks whether the jitter buffer is empty. If so, go to step B; otherwise, take voice packet data out of the jitter buffer and pass it to the VDP layer, then return to step A and check again whether the jitter buffer is empty.

B. The receiving RAN checks whether the condition for taking data out of the jitter buffer is currently met. If so, take voice packet data out of the jitter buffer and pass it to the VDP layer, then return to step A; otherwise, return to step B and continue checking whether the condition is met.

The condition for taking data out of the jitter buffer is checked as follows: the receiving RAN checks whether the occupancy of the jitter buffer has reached the configured buffer occupancy threshold; if so, it takes voice packet data out of the jitter buffer and passes it to the VDP layer; otherwise, it continues checking whether the occupancy has reached the threshold.

The method further comprises: setting a jitter buffer timer in the RAN.

In step B, before checking whether the condition for taking data out of the jitter buffer is met, the method further comprises: starting the jitter buffer timer.

The condition for taking data out of the jitter buffer is checked as follows:

B01. The receiving RAN checks whether the occupancy of the jitter buffer has reached the configured buffer occupancy threshold. If so, take voice packet data out of the jitter buffer, pass it to the VDP layer, and exit this procedure; otherwise, go to step B02.

B02. The receiving RAN checks whether the jitter buffer timer has expired. If so, take voice packet data out of the jitter buffer, pass it to the VDP layer, and exit this procedure; otherwise, return to step B01.

Alternatively, the two checks are made in the opposite order:

B11. The receiving RAN checks whether the jitter buffer timer has expired. If so, take voice packet data out of the jitter buffer, pass it to the VDP layer, and exit this procedure; otherwise, go to step B12.

B12. The receiving RAN checks whether the occupancy of the jitter buffer has reached the configured buffer occupancy threshold. If so, take voice packet data out of the jitter buffer, pass it to the VDP layer, and exit this procedure; otherwise, return to step B11.

The capacity of the jitter buffer is a fixed value set according to the predicted jitter of the packet network and the call quality required by users.

Alternatively, the capacity of the jitter buffer is determined dynamically by the receiving RAN according to the current jitter of the packet network.

The capacity of the jitter buffer is determined dynamically according to the current jitter of the packet network as follows:

determine the jitter value of the current voice packet data, which reflects the current jitter of the packet network;

determine the current capacity of the jitter buffer from the jitter value of the current voice packet data.

The jitter value of the current voice packet data is determined as follows: check whether the current voice packet data is the first voice packet data the receiving RAN has received in this voice service. If so, the receiving RAN uses the configured initial jitter value as the jitter value of the current voice packet data. Otherwise, the receiving RAN determines the transmission time of the current voice packet data and then determines its jitter value from the transmission time of the current voice packet data, the transmission time of the previous voice packet data, and the jitter value of the previous voice packet data.

The transmission time of voice packet data is determined as follows: the receiving RAN subtracts the data generation time carried in the voice packet data from the time at which it received that data.

The jitter value of the current voice packet data is determined from the transmission time of the current voice packet data, the transmission time of the previous voice packet data, and the jitter value of the previous voice packet data as follows:

jitter_n = jitter_(n-1) + (|transit_n - transit_(n-1)| - jitter_(n-1)) / o′

where jitter_n is the jitter value of the current voice packet data, jitter_(n-1) is the jitter value of the previous voice packet data, transit_n is the transmission time of the current voice packet data, transit_(n-1) is the transmission time of the previous voice packet data, and o′ is the convergence coefficient.

The convergence coefficient is 16.
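The recurrence above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the class name, method names and the choice of milliseconds are assumptions, and the initial jitter value of 0.0 is a placeholder for whatever value is configured on the RAN. The update has the same form as the interarrival jitter estimator in RFC 3550 (RTP), which also divides by 16.

```python
class JitterEstimator:
    """Running jitter estimate per the recurrence above (hypothetical helper)."""

    def __init__(self, initial_jitter_ms=0.0, convergence=16):
        self.initial_jitter_ms = initial_jitter_ms  # assumed configured value
        self.convergence = convergence              # the convergence coefficient
        self.jitter_ms = None
        self.prev_transit_ms = None

    def update(self, recv_time_ms, generation_time_ms):
        """Feed one received packet and return the updated jitter value."""
        # Transmission time = time of reception minus generation time in the packet.
        transit_ms = recv_time_ms - generation_time_ms
        if self.prev_transit_ms is None:
            # First packet of the call: use the configured initial jitter value.
            self.jitter_ms = self.initial_jitter_ms
        else:
            deviation = abs(transit_ms - self.prev_transit_ms)
            self.jitter_ms += (deviation - self.jitter_ms) / self.convergence
        self.prev_transit_ms = transit_ms
        return self.jitter_ms


est = JitterEstimator(initial_jitter_ms=0.0)
j1 = est.update(recv_time_ms=50, generation_time_ms=0)    # first packet -> 0.0
j2 = est.update(recv_time_ms=86, generation_time_ms=20)   # transit 66, deviation 16 -> 1.0
j3 = est.update(recv_time_ms=106, generation_time_ms=40)  # transit 66, deviation 0 -> 0.9375
```

Dividing by 16 means each new packet moves the estimate only one sixteenth of the way toward the latest deviation, so a single late packet does not by itself trigger a capacity change.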

The capacity of the jitter buffer is determined from the jitter value of the current voice packet data as follows:

set a maximum jitter threshold, a minimum jitter threshold, a maximum capacity threshold and a minimum capacity threshold on the RAN;

compare the jitter value of the current voice packet data with the maximum and minimum jitter thresholds. If the jitter value is greater than the maximum jitter threshold, check whether the current jitter buffer capacity is less than the maximum capacity threshold; if so, increase the jitter buffer capacity; otherwise, keep the current capacity unchanged;

if the jitter value of the current voice packet data is less than the minimum jitter threshold, check whether the current jitter buffer capacity is greater than the minimum capacity threshold; if so, decrease the jitter buffer capacity; otherwise, keep the current capacity unchanged;

if the jitter value of the current voice packet data is greater than or equal to the minimum jitter threshold and less than or equal to the maximum jitter threshold, keep the current jitter buffer capacity unchanged.

The jitter buffer capacity is increased as follows:

set a capacity step value on the RAN;

the increased jitter buffer capacity is the sum of the current jitter buffer capacity and the capacity step value.

The jitter buffer capacity is decreased as follows:

set a capacity step value on the RAN;

the decreased jitter buffer capacity is the current jitter buffer capacity minus the capacity step value.

Alternatively, the jitter buffer capacity is increased as follows:

set a capacity percentage on the RAN;

the increased jitter buffer capacity is the current jitter buffer capacity plus the product of that capacity and the capacity percentage.

In that case, the jitter buffer capacity is decreased as follows:

set a capacity percentage on the RAN;

the decreased jitter buffer capacity is the current jitter buffer capacity minus the product of that capacity and the capacity percentage.
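The threshold comparison and the two resizing rules (fixed step or percentage) can be sketched as a single function. This is an illustrative sketch only: the function name, the keyword parameters and the idea of counting capacity in packets are assumptions, and all numeric values in the usage lines are invented.

```python
def adjust_capacity(capacity, jitter, *, max_jitter, min_jitter,
                    max_capacity, min_capacity, step=None, percent=None):
    """Return the new jitter buffer capacity for the current jitter value.

    Exactly one of `step` (absolute amount) or `percent` (fraction of the
    current capacity, e.g. 0.25) selects the resizing rule.
    """
    if (step is None) == (percent is None):
        raise ValueError("give exactly one of step or percent")
    delta = step if step is not None else capacity * percent

    if jitter > max_jitter:
        # High jitter: grow only while still below the maximum capacity threshold.
        return capacity + delta if capacity < max_capacity else capacity
    if jitter < min_jitter:
        # Low jitter: shrink only while still above the minimum capacity threshold.
        return capacity - delta if capacity > min_capacity else capacity
    # Jitter within [min_jitter, max_jitter]: leave the capacity alone.
    return capacity


limits = dict(max_jitter=20, min_jitter=5, max_capacity=20, min_capacity=4)
grown = adjust_capacity(10, 30, step=2, **limits)            # 12
shrunk = adjust_capacity(10, 2, step=2, **limits)            # 8
steady = adjust_capacity(10, 10, step=2, **limits)           # 10
grown_pct = adjust_capacity(10, 30, percent=0.5, **limits)   # 15.0
```

The sketch follows the text literally and only checks the capacity thresholds before resizing, so a large step can carry the capacity past a threshold; a real implementation might additionally clamp the result.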

The method further comprises: allocating in the RAN a jitter buffer of a certain capacity for temporarily storing voice frame data.

After the receiving RAN converts the voice packet data into voice frame data, the method further comprises:

the receiving RAN stores the voice frame data in the jitter buffer;

the receiving RAN takes the voice frame data out of the jitter buffer.

The packet network is a code division multiple access (CDMA) network.

The present invention provides a method for implementing voice services. A VDP layer for converting between voice packet data and voice frame data is added to the protocol layers of the RAN. After the sending RAN receives encoded voice frame data from the sending MS, its VDP layer converts the voice frame data into voice packet data and sends the voice packet data to the packet network. When the receiving RAN receives the voice packet data from the packet network, its VDP layer converts the voice packet data into voice frame data and sends the voice frame data to the receiving MS, which decodes and plays it. In the prior art, the conversion between voice packet data and voice frame data is performed either by a packet voice gateway or by the handset. By comparison, the method of the present invention needs no dedicated device for converting between voice packet data and voice frame data, which simplifies the network structure and avoids additional cost. Nor does it require the handset to perform this format conversion, so it places no additional requirements on the handset and simplifies handset operation.

Moreover, as the technical solution shows, in the method of the present invention the receiving RAN performs jitter processing on the voice packet data. This avoids the increased network cost or the more complex handset operation that results in the prior art from having the packet voice gateway or the handset perform jitter processing.

Brief description of the drawings

Figure 1 is a schematic diagram of the system structure for implementing voice services in the first prior art;

Figure 2 is a schematic diagram of the system structure for implementing voice services in the second prior art;

Figure 3 is a schematic diagram of the protocol layers of the RAN of the present invention;

Figure 4 is a schematic diagram of the system structure for implementing voice services according to the present invention;

Figure 5 is a flowchart of the method for implementing voice services according to the present invention;

Figure 6 is a flowchart of the receiving RAN storing voice packet data in the jitter buffer;

Figure 7 is a flowchart of the receiving RAN taking voice packet data out of the jitter buffer;

Figure 8 is a flowchart of the method for dynamically adjusting the size of the jitter buffer according to the present invention.

Detailed description of the embodiments

To make the objects, technical solution and advantages of the present invention clearer, the invention is further described below with reference to the drawings and specific embodiments.

The method of the present invention improves on the prior art on the basis of the networking mode shown in Figure 2 by adding a voice data protocol (VDP) layer to the RAN. Figure 3 is a schematic diagram of the protocol layers of the RAN of the present invention. As Figure 3 shows, among the protocol layers through which the RAN interacts with the packet network, the RAN of the present invention includes, in addition to the existing packet transport layer, link layer and physical layer, a VDP layer added above the packet transport layer. The VDP layer converts between voice frame data and voice packet data, giving the RAN the ability to convert voice frame data into voice packet data and voice packet data into voice frame data.

Figure 4 is a schematic diagram of the system structure for implementing voice services according to the present invention. In the method of the present invention, the sending MS sends voice data to the sending RAN as voice frames; the sending RAN converts the voice frame format into the voice packet format and sends the result to the packet network; the packet network carries the voice packet data to the receiving RAN; the receiving RAN performs jitter processing on the received voice packet data, converts it into voice frame data, and sends it to the receiving MS; and the receiving MS decodes the voice frame data directly into a voice signal.

In jitter processing, the receiving RAN stores received voice packet data in a preconfigured jitter buffer, takes the voice packet data out of the jitter buffer when a retrieval condition is met, and then performs the step of converting the voice packet data into voice frame data.

Figure 5 is a flowchart of the method for implementing voice services according to the present invention. As Figure 5 shows, the method comprises the following steps:

Step 501: Add a VDP layer and a jitter buffer for jitter processing to both the sending RAN and the receiving RAN, and set the occupancy threshold and the buffer timer of the jitter buffer.

The VDP layer converts between voice frame data and voice packet data. Adding it to the RAN gives the RAN the ability to convert voice frame data into voice packet data and voice packet data into voice frame data.

Step 502: The sending MS encodes the user's voice, converting it into a digital bit stream, and sends the voice data over the air interface to the RAN as voice frames.

Voice data can be encoded in several ways, for example with the Enhanced Variable Rate Codec (EVRC) or with the QUALCOMM Code Excited Linear Prediction (QCELP) codec proposed by Qualcomm. Voice frame data can also use different formats, for example 20 ms per frame, meaning that each frame holds 20 ms of voice data.

Step 503: After receiving the voice frame data from the sending MS, the sending RAN converts it into voice packet data at the VDP layer according to a given packetization scheme, fills in the destination address, and passes the voice packet data to the packet network.

There are several packetization schemes: one voice frame can be packed into one voice packet, or several voice frames can be packed into one voice packet. The number of voice frames per packet is determined by a configured parameter, and either choice has trade-offs. The fewer frames each packet contains, the lower the network delay, but the more packets there are, which increases the load on the packet network. The more frames each packet contains, the higher the delay, but the fewer packets there are, so the load on the packet network is lighter. Because of this trade-off, the parameter that controls packetization needs to be adjusted dynamically according to the voice quality and network load measured at different times, to determine the number of voice frames that make up a voice packet.
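The packetization trade-off can be illustrated with a small sketch of what a VDP-style conversion might look like. Everything about the packet layout here is an assumption made for illustration (the `VoicePacket` fields, the string destination address, the helper names); the text only states that a packet carries one or more voice frames, a destination address and a data generation time.

```python
from dataclasses import dataclass
from typing import List

FRAME_MS = 20  # the 20 ms/frame format given as an example in the text


@dataclass
class VoicePacket:
    dest: str                 # destination address filled in by the sending RAN
    generation_time_ms: int   # generation time carried in the packet
    frames: List[bytes]       # one or more encoded voice frames


def pack_frames(frames, frames_per_packet, dest, start_ms=0):
    """Group consecutive voice frames into voice packets (frame -> packet)."""
    packets = []
    for i in range(0, len(frames), frames_per_packet):
        group = frames[i:i + frames_per_packet]
        packets.append(VoicePacket(dest, start_ms + i * FRAME_MS, group))
    return packets


def unpack_frames(packets):
    """Recover the voice frames in order (packet -> frame)."""
    return [frame for packet in packets for frame in packet.frames]


def speech_ms_per_packet(frames_per_packet):
    """Speech time one packet spans; the first frame waits at least this long."""
    return frames_per_packet * FRAME_MS


def packets_per_second(frames_per_packet):
    """Packet rate offered to the network during continuous speech."""
    return 1000 / (FRAME_MS * frames_per_packet)


frames = [bytes([i]) for i in range(5)]
packets = pack_frames(frames, frames_per_packet=2, dest="ran-b")
restored = unpack_frames(packets)
```

With one frame per packet the network sees 50 packets per second during continuous speech; with three frames per packet it sees about 17, but each packet then spans 60 ms of speech before it can be sent.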

Step 504: The receiving RAN uses the configured jitter buffer to perform jitter processing on the voice packet data from the packet network.

Step 505: The receiving RAN converts the voice packet data into voice frame data at the VDP layer and sends the voice frame data to the receiving MS.

Step 506: The receiving MS decodes the received voice frame data according to the frame format used by the sending MS and plays the voice data.

While the receiving RAN is not sending voice data to the MS, it can send comfort noise to the receiving MS so that the user does not perceive an obvious interruption in the voice; the receiving MS decodes the comfort noise and plays it. Comfort noise is a special noise that is not irritating to the listener.
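One way to picture the comfort noise behaviour is a downlink selector that falls back to a comfort noise frame whenever no voice frame is pending. The sketch below is purely illustrative: the function name is hypothetical and `COMFORT_NOISE_FRAME` is a placeholder byte string, since the text does not describe how comfort noise is encoded.

```python
from collections import deque

# Placeholder only; the real encoding of a comfort noise frame is not given.
COMFORT_NOISE_FRAME = b"\x00CN"


def next_downlink_frame(pending_frames):
    """Return the next frame to send to the receiving MS.

    `pending_frames` is a deque of voice frames already converted by the
    VDP layer; when it is empty, a comfort noise frame is sent instead.
    """
    if pending_frames:
        return pending_frames.popleft()
    return COMFORT_NOISE_FRAME


pending = deque([b"frame-1"])
first = next_downlink_frame(pending)    # b"frame-1"
second = next_downlink_frame(pending)   # comfort noise, nothing left to send
```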

In step 504, the receiving RAN performs jitter processing on the voice packet data from the packet network. Jitter processing consists of two parallel procedures: one in which the receiving RAN stores voice packet data in the jitter buffer, and one in which the receiving RAN takes voice packet data out of the jitter buffer. Both procedures are described in detail below.

Figure 6 is a flowchart of the receiving RAN storing voice packet data in the jitter buffer. As the figure shows, the procedure comprises the following steps:

Step 601: The receiving RAN checks whether it has received voice packet data from the packet network. If so, go to step 602; otherwise, return to step 601 and continue checking.

Step 602: The receiving RAN checks whether the jitter buffer has free space. If so, go to step 604; otherwise, go to step 603.

Step 603: The receiving RAN discards the oldest voice packet data in the jitter buffer, then goes to step 604.

Step 604: The receiving RAN stores the voice packet data in the jitter buffer, then returns to step 601.
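Steps 602 to 604 amount to a bounded FIFO that drops its oldest entry when full. A minimal sketch, assuming capacity is counted in packets and using hypothetical class and method names:

```python
from collections import deque


class JitterBuffer:
    """Bounded FIFO with a drop-oldest policy (illustrative sketch)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity   # assumed to be counted in packets
        self._packets = deque()
        self.dropped = 0           # number of packets discarded so far

    def put(self, packet):
        # Step 602: is there free space?
        if len(self._packets) >= self.capacity:
            # Step 603: discard the oldest stored packet.
            self._packets.popleft()
            self.dropped += 1
        # Step 604: store the newly received packet.
        self._packets.append(packet)

    def __len__(self):
        return len(self._packets)

    def snapshot(self):
        """Current contents, oldest first."""
        return list(self._packets)


buf = JitterBuffer(capacity=2)
for pkt in ["p1", "p2", "p3"]:
    buf.put(pkt)
contents = buf.snapshot()   # ["p2", "p3"]; "p1" was discarded
```

In Python, `collections.deque(maxlen=capacity)` gives the same drop-oldest behaviour automatically; the explicit version is used here only so that each branch lines up with a step in the flowchart.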

图7是接收方RAN从抖动缓冲区取出语音分组数据的流程图，从图中可以看出，该流程包括如下步骤：Fig. 7 is the flow chart of receiving side RAN taking out voice packet data from the jitter buffer, as can be seen from the figure, the process includes the following steps:

步骤701：接收方RAN判断抖动缓冲区是否为空，如果是，则转到步骤702；否则，转到步骤704。Step 701: The receiving RAN judges whether the jitter buffer is empty, if yes, go to step 702; otherwise, go to step 704.

步骤702：接收方RAN启动抖动缓冲定时器。Step 702: The receiving RAN starts the jitter buffer timer.

步骤703：接收方RAN判断当前是否满足从抖动缓冲区中取出语音分组数据的条件，有两种判断方式，可任选其一：Step 703: The receiver RAN judges whether the condition for taking out the voice packet data from the jitter buffer is satisfied at present. There are two judgment methods, one of which can be selected:

(1)接收方RAN判断抖动缓冲区的空间占用率是否达到了设定的缓冲区容量的空间占用率阈值，如果达到阈值，则转到步骤704；如果未达到阈值，则判断抖动缓冲定时器是否超时，如果超时，则转到步骤704；如果未超时，则返回步骤703。(1) The receiver RAN judges whether the space occupancy rate of the jitter buffer has reached the space occupancy threshold of the buffer capacity set, if it reaches the threshold, then go to step 704; if it does not reach the threshold, then judge the jitter buffer timer Whether it is overtime, if overtime, go to step 704; if not overtime, then return to step 703.

(2)接收方RAN判断抖动缓冲定时器是否超时，如果超时，则转到步骤704；如果未超时，则判断缓冲区的空间占用率是否达到空间占用率阈值，如果达到阈值，则转到步骤704；如果未达到阈值，则返回步骤703。(2) the receiver RAN judges whether the jitter buffer timer is overtime, if overtime, then go to step 704; if not overtime, then judge whether the space occupancy rate of the buffer zone reaches the space occupancy rate threshold, if it reaches the threshold value, then go to step 704 704 ; if the threshold is not reached, return to step 703 .

Step 704: The receiver RAN takes the voice packet data out of the jitter buffer and passes it to the VDP layer.

It should be noted that, in the scheme for taking voice packet data out of the jitter buffer, the jitter buffer timer may also be omitted. In that case, whether the condition for taking out voice packet data is satisfied is decided solely by judging whether the current space occupancy rate of the jitter buffer has reached the occupancy threshold.
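The fetch condition of step 703 can be sketched as a single predicate. The function name `may_fetch` and its parameters are assumptions made for illustration; the patent does not specify units or data types. Because the two judgment methods differ only in the order of the checks, they give the same result, so one function covers both. Passing `timeout=None` models the variant without a jitter buffer timer.

```python
def may_fetch(used, capacity, occupancy_threshold, elapsed, timeout=None):
    """Return True when voice packet data may be taken out of the jitter buffer.

    used / capacity      -- current space occupancy rate of the buffer
    occupancy_threshold  -- configured occupancy threshold, e.g. 0.5
    elapsed              -- time since the jitter buffer timer was started
    timeout              -- timer duration, or None if no timer is used
    """
    # Occupancy check: has the buffer filled up to the threshold?
    if used / capacity >= occupancy_threshold:
        return True
    # Timer check: has the jitter buffer timer expired?
    if timeout is not None and elapsed >= timeout:
        return True
    # Neither condition holds, so the caller repeats step 703.
    return False


print(may_fetch(used=3, capacity=4, occupancy_threshold=0.5, elapsed=0.0, timeout=0.06))
print(may_fetch(used=1, capacity=4, occupancy_threshold=0.5, elapsed=0.01, timeout=0.06))
```

The timer bounds the delay added to the first packets after the buffer runs empty, while the occupancy threshold lets playback resume earlier when packets arrive quickly.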

In addition, the capacity of the jitter buffer may be a network-configured parameter, or a parameter that is changed dynamically according to the jitter of the packet network. The embodiment above belongs to the former case: when the network is configured, the buffer capacity is set according to the predicted jitter of the packet network and the voice quality required by users, and once configuration is complete the parameter cannot be adjusted automatically. In the latter case, an initial jitter buffer capacity is set and then changed dynamically according to the jitter of the packet network, as follows: after the receiver RAN receives new voice packet data, it calculates a jitter value from the received voice packet data and adjusts the capacity of the jitter buffer according to that value.

The method for dynamically adjusting the capacity of the jitter buffer is described below.

To adjust the jitter buffer capacity dynamically, in addition to adding the VDP layer and the jitter buffer to the RAN and setting their parameters, several further parameters must be set: the initial jitter value, the maximum jitter threshold, the minimum jitter threshold, and the maximum and minimum capacity thresholds of the jitter buffer.

Dynamic adjustment of the jitter buffer presupposes recording the time at which the sender RAN generates the voice packet data and the time at which the receiver RAN receives it. Therefore, when the sender RAN assembles voice packet data, it records the current system time and attaches it to the voice packet data as the packet data generation time; when the receiver RAN receives voice packet data from the packet network, it records the current system time as the packet data arrival time.

After the receiver RAN receives voice packet data from the packet network and before it performs jitter processing on that data, a step of dynamically adjusting the buffer capacity is added. As shown in Fig. 8, the steps for dynamically adjusting the jitter buffer capacity according to the present invention are as follows:

Step 801: The receiver RAN calculates the jitter value according to formula (1):

$$\text{jitter value}_{n} = \text{jitter value}_{n-1} + \frac{\left|\text{transmission time}_{n} - \text{transmission time}_{n-1}\right| - \text{jitter value}_{n-1}}{\sigma} \qquad (1)$$

Here, $\text{jitter value}_{n}$ is the jitter value currently being calculated; $\text{jitter value}_{n-1}$ is the jitter value obtained in the previous calculation; $\text{transmission time}_{n}$ is the transmission time of the current packet data over the network, equal to the arrival time of the current packet data minus the packet data generation time carried in the packet; $\text{transmission time}_{n-1}$ is the transmission time of the previous packet data over the network; and $\sigma$ is a convergence coefficient that can be adjusted to suit network conditions, preferably set to 16.

If the currently received voice packet data is the first voice packet data, only the transmission time of the packet data is calculated, as the difference between the packet data arrival time and the packet data generation time. No jitter value is calculated at this point; the jitter value is the configured initial jitter value.

Starting from the second voice packet data received by the receiver RAN, the transmission time of the previous voice packet data is used as $\text{transmission time}_{n-1}$, the jitter value of the previous voice packet data is used as $\text{jitter value}_{n-1}$, and the transmission time of the current packet data is used as $\text{transmission time}_{n}$; the value $\text{jitter value}_{n}$ is then calculated with formula (1).
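The jitter calculation of step 801 can be sketched as a small running estimator. The class name `JitterEstimator`, the method `on_packet`, and the use of milliseconds in the example are assumptions for illustration only. The update rule itself is formula (1), with the convergence coefficient defaulting to the preferred value 16.

```python
class JitterEstimator:
    """Hypothetical running jitter estimator following formula (1)."""

    def __init__(self, initial_jitter, coefficient=16):
        self.jitter = initial_jitter      # configured initial jitter value
        self.coefficient = coefficient    # convergence coefficient
        self.prev_transit = None          # transmission time of the previous packet

    def on_packet(self, generated_at, arrived_at):
        # Transmission time = arrival time - generation time carried in the packet.
        transit = arrived_at - generated_at
        if self.prev_transit is not None:
            # Formula (1): move the estimate toward |transit_n - transit_(n-1)|.
            delta = abs(transit - self.prev_transit)
            self.jitter += (delta - self.jitter) / self.coefficient
        # For the first packet the jitter stays at the initial value.
        self.prev_transit = transit
        return self.jitter


est = JitterEstimator(initial_jitter=0.0)
print(est.on_packet(generated_at=0, arrived_at=50))    # first packet: 0.0
print(est.on_packet(generated_at=20, arrived_at=86))   # transit 66 vs 50: 1.0
print(est.on_packet(generated_at=40, arrived_at=90))   # transit 50 vs 66: 1.9375
```

Dividing by the convergence coefficient makes the estimate a smoothed average, so a single delayed packet changes it only slightly; a larger coefficient reacts more slowly to changes in network jitter.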

Step 802: Compare the calculated current jitter value with the maximum jitter threshold and the minimum jitter threshold. If the jitter value is greater than the maximum jitter threshold, go to step 803; if the jitter value is less than the minimum jitter threshold, go to step 804; if the jitter value lies between the two thresholds, inclusive of both, go to step 805.

Step 803: Judge whether the current capacity of the jitter buffer is less than the maximum capacity threshold; if so, go to step 806; otherwise, go to step 805.

Step 804: Judge whether the current capacity of the jitter buffer is greater than the minimum capacity threshold; if so, go to step 807; otherwise, go to step 805.

Step 805: Keep the current capacity of the jitter buffer unchanged, then end.

Step 806: Increase the capacity of the jitter buffer, then end.

There are two ways to increase the capacity: one is to add a fixed step value; the other is to add a certain percentage of the current capacity, for example 20% of the current capacity.

Step 807: Reduce the capacity of the jitter buffer.

There are two ways to reduce the capacity: one is to subtract a fixed step value; the other is to subtract a certain percentage of the current capacity, for example 20% of the current capacity.
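Steps 802 to 807 can be sketched as one function that returns the new capacity. The function name `adjust_capacity`, its keyword parameters, and the example threshold values are assumptions for illustration. The sketch applies the step or percentage exactly as described and does not clamp the result to the capacity thresholds, since the patent text does not describe clamping.

```python
def adjust_capacity(capacity, jitter, *, min_jitter, max_jitter,
                    min_capacity, max_capacity, step=1, percent=None):
    """Return the jitter buffer capacity after one adjustment pass.

    If `percent` is given (e.g. 0.2 for 20%), the change is that fraction of
    the current capacity; otherwise the fixed `step` value is used.
    """
    delta = capacity * percent if percent is not None else step

    # Steps 802/803: jitter above the maximum threshold and room to grow.
    if jitter > max_jitter and capacity < max_capacity:
        return capacity + delta          # step 806
    # Steps 802/804: jitter below the minimum threshold and room to shrink.
    if jitter < min_jitter and capacity > min_capacity:
        return capacity - delta          # step 807
    # Step 805: otherwise keep the capacity unchanged.
    return capacity


limits = dict(min_jitter=5, max_jitter=20, min_capacity=4, max_capacity=16)
print(adjust_capacity(10, 25, step=2, **limits))       # high jitter, fixed step: 12
print(adjust_capacity(10, 2, percent=0.2, **limits))   # low jitter, 20%: 8.0
print(adjust_capacity(10, 10, step=2, **limits))       # jitter in range: 10
```

A fixed step changes the capacity linearly, while a percentage changes it in proportion to the current size; which suits a given network better is a deployment choice that the source leaves open.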

After the buffer capacity has been determined, the receiver RAN uses the jitter buffer at the VDP layer to perform jitter processing on the voice packet data.

In the method for implementing packet voice services described above, the receiver RAN performs jitter processing on the received voice packet data and then converts the voice packet data into voice frame data. In practice, the receiver RAN may instead first convert the received voice packet data into voice frame data and then perform jitter processing on the voice frame data. Jitter processing of voice frame data likewise uses the jitter buffer; in this case the jitter buffer stores voice frame data rather than voice packet data. The capacity of the jitter buffer may be a preset fixed value or, as described above, a jitter value may be calculated from the current jitter of the packet network and the capacity of the jitter buffer changed dynamically according to the calculated value.

In a specific implementation, the method according to the present invention may be modified as appropriate to meet the needs of the particular situation. It should therefore be understood that the specific embodiments of the present invention are illustrative only and are not intended to limit the scope of protection of the present invention. For example, the present invention is not limited to use in CDMA networks and may also be applied in other types of packet networks.

Claims

1. A method for realizing voice service, characterized in that the method comprises the steps:

Add a voice data protocol VDP layer for mutual conversion between voice packet data and voice frame data on the protocol layer of the radio access network RAN;

After the sender RAN receives the encoded voice frame data from the sender mobile station MS, its own VDP layer converts the voice frame data into voice packet data, and sends the voice packet data to the packet network;

After receiving the voice packet data from the packet network, the receiver RAN converts the voice packet data into voice frame data by its own VDP layer and sends the voice frame data to the receiver MS, and the receiver MS decodes and plays the voice frame data.

2. The method for realizing voice services according to claim 1, characterized in that the method further comprises: in the RAN, allocate a jitter buffer with a certain capacity for temporarily storing voice packet data for each connected link,

After the receiver RAN receives the voice packet data and before it converts the voice packet data into voice frame data, the method further includes: the receiver RAN stores the received voice packet data into the jitter buffer and, when the fetching condition is satisfied, takes the voice packet data out of the jitter buffer to the VDP layer.

3. The method for realizing voice services according to claim 2, characterized in that the receiver RAN stores the received voice packet data into the jitter buffer as follows: the receiver RAN judges whether the jitter buffer currently has remaining space; if so, it stores the voice packet data into the jitter buffer; otherwise, it discards the earliest-stored voice packet data in the jitter buffer and then stores the received voice packet data into the jitter buffer.

4. The method for realizing a voice service according to claim 2, wherein the method for the receiver RAN to take out the voice packet data from the jitter buffer to the VDP layer is:

A, the receiver RAN judges whether the jitter buffer is empty, if yes, then proceeds to step B; otherwise, takes out the voice packet data from the jitter buffer to the VDP layer, then returns to step A, and continues to judge whether the jitter buffer is empty;

B. The receiver RAN judges whether the condition for taking data out of the jitter buffer is currently satisfied. If so, it takes the voice packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it returns to step B and continues to judge whether the condition for taking data out of the jitter buffer is satisfied.

5. The method for realizing a voice service according to claim 4, wherein the method for judging whether the condition for taking data out of the jitter buffer is currently satisfied is: the receiver RAN judges whether the space occupancy rate of the jitter buffer has reached the configured buffer space occupancy threshold; if so, it takes the voice packet data out of the jitter buffer to the VDP layer; otherwise, it continues to judge whether the space occupancy rate of the jitter buffer has reached the configured threshold.

6. The method for implementing voice services according to claim 4, characterized in that the method further comprises: setting a jitter buffer timer in the RAN,

In step B, before judging whether the data is taken out from the jitter buffer, it further includes: starting the jitter buffer timer,

The method for judging whether the condition for taking out data from the jitter buffer is satisfied at present is as follows:

B01, the receiver RAN judges whether the space occupancy rate of the jitter buffer reaches the set buffer space occupancy threshold, if yes, then takes out the voice packet data from the jitter buffer to the VDP layer, and then exits this process; otherwise, go to step B02;

B02. The receiver RAN judges whether the jitter buffer timer has expired, and if so, takes out the voice packet data from the jitter buffer to the VDP layer, and then exits this process; otherwise, returns to step B01.

7. The method for implementing voice services according to claim 4, characterized in that the method further comprises: setting a jitter buffer timer in the RAN,

B11, the receiving side RAN judges whether the jitter buffer timer is overtime, if yes, then take out the voice packet data from the jitter buffer to the VDP layer, then exit this process; otherwise, go to step B12;

B12, the receiver RAN judges whether the space occupancy rate of the jitter buffer reaches the set buffer space occupancy threshold, if yes, then takes out the voice packet data from the jitter buffer to the VDP layer, and then exits this process; otherwise, returns to step B11 .

8. The method for implementing voice services according to claim 2, wherein the capacity of the jitter buffer is a fixed value set according to the predicted jitter of the packet network and the call quality required by the user.

9. The method for implementing voice services according to claim 2, characterized in that the capacity of the jitter buffer is dynamically determined by the receiver RAN according to the current jitter of the packet network.

10. The method for realizing voice service according to claim 9, characterized in that, the method for dynamically determining the capacity of the jitter buffer according to the jitter situation of the current packet network is:

Determine the jitter value of the current voice packet data reflecting the jitter situation of the current packet network;

Determine the capacity of the current jitter buffer according to the jitter value of the current voice packet data.

11. The method for implementing voice services according to claim 10, characterized in that the method for determining the jitter value of the current voice packet data is: judging whether the current voice packet data is the first voice packet data received by the receiver RAN in this voice service; if so, the receiver RAN uses the set initial jitter value as the jitter value of the current voice packet data; otherwise, the receiver RAN determines the transmission time of the current voice packet data, and then determines the jitter value of the current voice packet data according to the transmission time of the current voice packet data, the transmission time of the last voice packet data and the jitter value of the last voice packet data.

12. The method for implementing voice services according to claim 11, characterized in that the method for determining the transmission time of the voice packet data is: the receiver RAN subtracts the data generation time contained in the voice packet data from the time at which it received the voice packet data, thereby obtaining the transmission time of the voice packet data.

13. The method for implementing voice services according to claim 11, characterized in that the method for determining the jitter value of the current voice packet data according to the transmission time of the current voice packet data, the transmission time of the last voice packet data and the jitter value of the last voice packet data is:

$$\text{jitter value}_{n} = \text{jitter value}_{n-1} + \frac{\left|\text{transmission time}_{n} - \text{transmission time}_{n-1}\right| - \text{jitter value}_{n-1}}{\sigma},$$

where $\text{jitter value}_{n}$ is the jitter value of the current voice packet data, $\text{jitter value}_{n-1}$ is the jitter value of the last voice packet data, $\text{transmission time}_{n}$ is the transmission time of the current voice packet data, $\text{transmission time}_{n-1}$ is the transmission time of the last voice packet data, and $\sigma$ is the convergence coefficient.

14. The method for implementing voice services according to claim 13, wherein the convergence coefficient is 16.

15. The method for implementing voice services according to claim 10, characterized in that the method for determining the capacity of the jitter buffer according to the jitter value of the current voice packet data is:

Set the maximum jitter threshold, minimum jitter threshold, maximum capacity threshold and minimum capacity threshold on the RAN;

Compare the jitter value of the current voice packet data with the maximum jitter threshold and the minimum jitter threshold; if the jitter value of the current voice packet data is greater than the maximum jitter threshold, then judge whether the current jitter buffer capacity is less than the maximum capacity threshold; if so, increase the jitter buffer capacity; otherwise, keep the current jitter buffer capacity unchanged;

If the jitter value of current voice packet data is less than the minimum jitter threshold, then judge whether the current jitter buffer capacity is greater than the minimum capacity threshold, if so, then reduce the jitter buffer capacity; otherwise, keep the current jitter buffer capacity unchanged;

If the jitter value of the current voice packet data is greater than or equal to the minimum jitter threshold and less than or equal to the maximum jitter threshold, keep the capacity of the current jitter buffer unchanged.

16. The method for implementing voice services according to claim 15, wherein the method for increasing the capacity of the jitter buffer is:

Set the capacity step value on the RAN;

The increased jitter buffer capacity is the sum of the current jitter buffer capacity and the capacity step value,

The method for reducing the capacity of the jitter buffer is as follows:

Set the capacity step value on the RAN;

The reduced jitter buffer capacity is the difference between the current jitter buffer capacity and the capacity step value.

17. The method for implementing voice services according to claim 15, wherein the method for increasing the capacity of the jitter buffer is:

Set capacity percentage on RAN;

The increased jitter buffer capacity is the current jitter buffer capacity plus the product of this capacity and the capacity percentage,

The method for reducing the capacity of the jitter buffer is as follows:

Set capacity percentage on RAN;

The reduced jitter buffer capacity is the current jitter buffer capacity minus the product of this capacity and the capacity percentage.

18. The method for implementing voice services according to claim 1, characterized in that the method further comprises: allocating a jitter buffer with a certain capacity in the RAN to temporarily store voice frame data,

After the receiver RAN converts the voice packet data into voice frame data, it further includes:

The receiver RAN stores the voice frame data in the jitter buffer;

The receiver RAN fetches voice frame data from the jitter buffer.

19. The method for implementing voice services according to claim 1, wherein the packet network is a code division multiple access network.