US12488803B2 - Method and system for encoding and wirelessly transmitting stereo audio content for audio communication - Google Patents
- Publication number
- US12488803B2 (application no. US17/741,874)
- Authority
- US
- United States
- Prior art keywords
- audio
- packets
- encoded
- stereo
- pair
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/167—Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/004—Arrangements for detecting or preventing errors in the information received by using forward error control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/60—Substation equipment, e.g. for use by subscribers including speech amplifiers
- H04M1/6033—Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/60—Substation equipment, e.g. for use by subscribers including speech amplifiers
- H04M1/6033—Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
- H04M1/6041—Portable telephones adapted for handsfree use
- H04M1/6058—Portable telephones adapted for handsfree use involving the use of a headset accessory device connected to the portable telephone
- H04M1/6066—Portable telephones adapted for handsfree use involving the use of a headset accessory device connected to the portable telephone including a wireless connection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42203—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/436—Interfacing a local distribution network, e.g. communicating with another STB or one or more peripheral devices inside the home
- H04N21/4363—Adapting the video stream to a specific local network, e.g. a Bluetooth® network
- H04N21/43637—Adapting the video stream to a specific local network, e.g. a Bluetooth® network involving a wireless protocol, e.g. Bluetooth, RF or wireless LAN [IEEE 802.11]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/485—End-user interface for client configuration
- H04N21/4852—End-user interface for client configuration for modifying audio parameters, e.g. switching between mono and stereo
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/033—Headphones for stereophonic communication
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/04—Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/80—Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2250/00—Details of telephonic subscriber devices
- H04M2250/02—Details of telephonic subscriber devices including a Bluetooth interface
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2420/00—Details of connection covered by H04R, not provided for in its groups
- H04R2420/07—Applications of wireless loudspeakers or wireless microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/03—Application of parametric coding in stereophonic audio systems
Definitions
- This disclosure relates to the field of audio communication including techniques for encoding and packetizing speech as stereo audio signals for transmissions over a Bluetooth link. Other aspects are also described.
- Wireless headsets may be communicatively coupled to electronic devices via a wireless communication protocol such as the Bluetooth protocol for wearers of the wireless headsets to engage in voice calls with remote users, listen to media content, issue voice commands, obtain query responses, etc.
- the Bluetooth protocol is typically configured to transmit monophonic content in either direction to reduce audio processing complexity and latency.
- the Bluetooth protocol may be configured in the Hands-Free Profile (HFP) for conducting hands-free calls using a single audio channel.
- HFP Hands-Free Profile
- HFP for monophonic audio signals limits the data transmission to a maximum data rate of 64 kilobits-per-second (kbps) by digitally encoding speech signals and packetizing encoded audio frames into a 60-byte packet every 7.5 ms.
- the data throughput constraint limits the sampling rate of the speech signals to 16 KHz, reducing the bandwidth of the speech signals in high-fidelity applications and compromising the audio quality of the audio signals rendered at the wireless headsets.
- Audio applications that process speech signals such as those used in telephony, videoconferencing, or voice query context may enhance audio quality by transmitting and receiving audio signals in stereo format carried by two or more audio channels over a Bluetooth link.
- videoconferencing applications may transmit stereoscopic speech signals to allow a wireless headset to spatially render the speech signals to create the perception of directionality of the speaker to the listener.
- a listener carrying on a conversation with another person using a videoconferencing application may listen to music streamed in stereo format.
- Bluetooth protocol that is conventionally configured to transmit monophonic content may be adapted to transmit speech signals in stereo format in both directions to enhance the quality of audio signals rendered at a receiving device.
- Bluetooth communication protocols such as the Hands-Free Profile (HFP) or the Headset Profile (HSP) may be used to exchange digitized audio data over a bi-directional wireless link between a source device and a listening or playback device such as a wireless headset.
- HSP Headset Profile
- These profiles may support “voice-quality” or low-quality single channel or monophonic audio communication between the devices.
- mono HFP typically uses codecs (e.g., low complexity modified sub-band codec (mSBC)) with a sampling rate of 16 KHz.
- mSBC low complexity modified sub-band codec
- Mono HFP may be expanded to support Advanced Audio Coding-Enhanced Low Delay (AAC-ELD) codec with a higher sampling rate of 24 KHz to generate audio frames having a duration of 7.5 ms and a frame size of 180 samples.
- AAC-ELD Advanced Audio Coding-Enhanced Low Delay
- the audio frames may be packetized into ELD packets and the ELD packets assembled into Bluetooth 2-EV3 transport packets having a packet size of 60 bytes every 7.5 ms, yielding a maximum bit rate of 64 kbps.
- the sampling rate and the throughput of the mono HFP may support 12 KHz wide-band (WB) uplink and downlink monophonic audio, but the configuration is insufficient to support the higher throughput needed for bi-directional stereo audio.
- Mono HFP is also insufficient to support the higher sampling rate needed for higher bandwidth audio such as 16 KHz super wide-band (SWB) or 24 KHz full-band (FB) audio.
- the 2-EV3 transport packets of the mono HFP with a packet size of 60 bytes and a packet duration of 7.5 ms may be expanded to use the 2-EV5 transport packets having a packet size of 360 bytes and a packet duration of 15 ms.
- the newer 2-EV5 transport packets are an enhancement to the Bluetooth protocol that provides larger Bluetooth transport packets at a longer duty cycle.
- Using the 2-EV5 transport packets allows a tripling of the data throughput from 64 to 192 kbps. The increased throughput may be used to support stereo WB audio signals based on the same sampling rate and the same block size of audio frames used for mono HFP.
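The tripling of throughput can be checked directly from the packet sizes and intervals quoted above. The following is a minimal sketch; the function name `throughput_kbps` is illustrative and not part of the disclosure, and protocol overhead is ignored.

```python
def throughput_kbps(packet_bytes: int, interval_ms: float) -> float:
    """Bit rate in kbps for one packet of packet_bytes sent every interval_ms."""
    # bits per millisecond is numerically equal to kilobits per second
    return packet_bytes * 8 / interval_ms

mono_2ev3 = throughput_kbps(60, 7.5)     # 2-EV3: 60 bytes every 7.5 ms
stereo_2ev5 = throughput_kbps(360, 15)   # 2-EV5: 360 bytes every 15 ms

print(mono_2ev3, stereo_2ev5, stereo_2ev5 / mono_2ev3)
```

The ratio of the two results is the factor of three stated in the text.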
- the AAC-ELD codec may be configured to generate two streams or channels of stereo audio signals using the same sampling rate of 24 KHz as the mono HFP.
- the 24 KHz stereo AAC-ELD configuration may generate audio frames with a frame size (also referred to as block size) of 180 samples and a frame duration of 7.5 ms for each channel.
- the audio frames of the two channels may be packetized into ELD packets of different sizes as a function of the desired audio quality.
- two ELD packets within a current 15 ms interval, each ELD packet packetized from the 180-sample audio frames of each of the two channels, may be bundled with two ELD packets from a previous 15 ms interval to constitute the 2-EV5 transport packet of 360 bytes every 15 ms.
- the two ELD packets of the previous 15 ms interval are considered redundant or forward error correction (FEC) packets that may be used by the decoder to recover up to 100% of single packet loss.
- FEC forward error correction
- the two new ELD packets of the current 15 ms interval yield a maximum effective data rate of 96 kbps.
- two ELD packets within a current 15 ms interval of the 2-EV5 transport packet, each ELD packet packetized from the 180-sample audio frames of each of the two channels, may be bundled with one of the two ELD packets from a previous 15 ms interval to constitute the 360 bytes of the 2-EV5 transport packet every 15 ms.
- the ELD packets for the high quality stereo WB audio may be larger than the ELD packets for the medium quality stereo WB audio to provide higher data rate.
- the high quality stereo WB audio may have a maximum effective data rate of 128 kbps.
- the redundant or FEC packet from the previous 15 ms interval may allow recovery of up to 50% of single packet loss.
- the high quality stereo WB audio thus achieves higher audio quality but at a cost of reduced robustness against packet loss when compared to the medium quality stereo WB audio.
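The quality versus robustness trade-off for the 24 KHz stereo configuration can be made concrete by splitting the 360-byte transport packet into new and redundant bytes. The sketch below derives the byte counts from the effective data rates quoted above; the counts themselves are not stated in the text, and the calculation assumes the full 360 bytes carry ELD data with no header overhead.

```python
TRANSPORT_BYTES = 360   # 2-EV5 transport packet size (from the text)
INTERVAL_MS = 15        # 2-EV5 packet interval (from the text)

def payload_split(effective_kbps: int) -> tuple[int, int]:
    """Return (new_bytes, redundant_bytes) per transport packet."""
    # kbit/s times ms gives bits; divide by 8 for bytes
    new_bytes = effective_kbps * INTERVAL_MS // 8
    return new_bytes, TRANSPORT_BYTES - new_bytes

for label, rate in (("medium", 96), ("high", 128)):
    new_bytes, fec_bytes = payload_split(rate)
    print(f"{label}: {new_bytes} new bytes, {fec_bytes} redundant bytes")
```

Under these assumptions, medium quality leaves as many redundant bytes as new bytes, which is consistent with repeating both ELD packets of the previous interval, while high quality leaves half as many redundant bytes as new bytes, which is consistent with repeating only one of the two.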
- the audio frame of 7.5 ms supporting a maximum block size of 180 samples may be expanded to support the higher sampling rate and the resulting bigger block size needed for sampling and encoding the wider audio bandwidth of the 16 KHz SWB or the 24 KHz FB audio.
- the AAC-ELD decoder of the Bluetooth audio link may support a maximum block size of 480 samples.
- the AAC-ELD codec may be configured with a sampling rate of 32 KHz to generate stereo audio samples.
- the 32 KHz stereo AAC-ELD configuration may generate audio frames with a block size of 480 samples and a frame duration of 15 ms for each channel.
- the audio frames of the two channels may be packetized into ELD packets of different sizes as a function of the desired audio quality.
- one ELD packet packetized from the 480-sample audio frame of each channel for a current 15 ms interval may be bundled with one ELD packet from a previous 15 ms interval to constitute the 2-EV5 transport packet of 360 bytes every 15 ms.
- the ELD packet of the previous 15 ms interval is the redundant or FEC packet that may be used by the decoder to recover up to 100% of single packet loss.
- the ELD packet of the current 15 ms interval yields a maximum effective data rate of 96 kbps.
- one ELD packet packetized from the 480-sample audio frame of each of the two channels for a current 15 ms interval may be bundled with a smaller ELD packet packetized from the audio frame of a previous 15 ms interval to constitute the 360-byte 2-EV5 transport packet every 15 ms.
- the ELD packets for the high quality stereo SWB audio may be larger than the ELD packets for the medium quality stereo SWB audio to provide higher data rate.
- the high quality stereo SWB audio may have a maximum effective data rate of 128 kbps.
- the redundant or FEC packet from the previous 15 ms interval may allow recovery of up to 100% of single packet loss, albeit at a reduced audio quality due to the smaller ELD packet size.
- the high quality stereo SWB audio thus achieves higher audio quality but at a cost of reduced robustness against packet loss when compared to the medium quality stereo SWB audio.
- the AAC-ELD codec may be configured with a sampling rate of 48 KHz to generate stereo audio signals.
- the 48 KHz stereo AAC-ELD configuration may generate audio frames with a block size of 480 samples and a frame duration of 10 ms for each channel.
- the audio frames of the two channels may be packetized into ELD packets.
- One ELD packet for a current 10 ms may be bundled or concatenated with one half of another ELD packet packetized from a previous or a next 10-ms audio frame to constitute the 2-EV5 transport packet of 360 bytes every 15 ms.
- Three ELD packets are thus fragmented into two Bluetooth transport packets.
- the 360 bytes from the 1½ new ELD packets every 15 ms yield a maximum effective data rate of 192 kbps. Because the new ELD packets fully occupy each 2-EV5 transport packet, no redundant or FEC packets are available for recovery from packet loss.
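The fragmentation of three ELD packets across two transport packets can be sketched as a simple byte-stream split. The 240-byte ELD packet size is an assumption derived from 360 bytes holding 1½ ELD packets; the function name `fragment` and the dummy payloads are illustrative only.

```python
ELD_BYTES = 240         # assumed: 360 bytes / 1.5 ELD packets
TRANSPORT_BYTES = 360   # 2-EV5 transport packet size (from the text)

def fragment(eld_packets: list[bytes]) -> list[bytes]:
    """Concatenate ELD packets and cut the stream into transport-sized chunks."""
    stream = b"".join(eld_packets)
    return [stream[i:i + TRANSPORT_BYTES]
            for i in range(0, len(stream), TRANSPORT_BYTES)]

# Three dummy 10 ms ELD packets, each filled with its own index byte.
eld = [bytes([n]) * ELD_BYTES for n in range(3)]
transport = fragment(eld)

print(len(transport), [len(p) for p in transport])
```

The middle ELD packet ends up split evenly between the two transport packets, so losing either transport packet damages two ELD packets, which illustrates why this configuration has no room for recovery.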
- the AAC-ELD decoder of the Bluetooth audio link supports a maximum block size of 360 samples
- packet fragmentation associated with the 48 KHz stereo AAC-ELD configuration may be eliminated.
- the AAC-ELD codec may be configured with a sampling rate of 48 KHz to generate audio frames with a block size of 360 samples and a frame duration of 7.5 ms for each channel.
- the audio frames of the two channels may be packetized into ELD packets.
- Two ELD packets in a 15 ms interval may be bundled or concatenated to constitute the 2-EV5 transport packet of 360 bytes every 15 ms, yielding a maximum effective data rate of 192 kbps.
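The frame durations quoted for each configuration follow from dividing the block size by the sampling rate. The sketch below tabulates the four configurations described above and how many frames per channel fit in one 15 ms transport interval; the function name is illustrative.

```python
def frame_duration_ms(block_size: int, sampling_rate_hz: int) -> float:
    """Duration of one audio frame in milliseconds."""
    return 1000 * block_size / sampling_rate_hz

# (sampling rate in Hz, block size in samples) taken from the text
configs = [(24_000, 180), (32_000, 480), (48_000, 480), (48_000, 360)]

for rate, block in configs:
    duration = frame_duration_ms(block, rate)
    frames_per_interval = 15 / duration   # frames per channel per 15 ms
    print(rate, block, duration, frames_per_interval)
```

Only the 48 KHz configuration with 480-sample blocks gives a non-integer number of frames per interval, which is the source of the fragmentation that the 360-sample block size avoids.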
- aspects of the disclosure configure the AAC-ELD codec to support bi-directional stereo operation using the same sampling rate and block size of audio frames as mono HFP by leveraging the larger packet size of the 2-EV5 transport packets.
- Both the downlink audio from the local source device to the receiving device and the uplink audio from the receiving device to the local source device may use the same sampling frequency of 24 KHz to sample and encode the 12 KHz bandwidth of WB audio in two channels.
- a packetized audio frame is smaller than the 2-EV5 transport packet and several packetized audio frames may be bundled to constitute the 2-EV5 transport packet, allowing decoupling of the audio payload of the ELD packets and the Bluetooth transport packets.
- the larger 2-EV5 packet size may allow the transport of redundant or FEC packets to allow the receiving device to recover from packet loss.
- aspects of the disclosure also configure the AAC-ELD codec to support bi-directional stereo operation for the higher bandwidth of the SWB or FB audio by expanding the block size of the audio frames generated by the AAC-ELD codec from 180 samples to 480 samples.
- the AAC-ELD codec may be configured to sample and encode SWB or FB audio in two channels by increasing the sampling rate to 32 KHz and 48 KHz, respectively.
- the downlink and uplink audio may use different sampling frequencies to support audio of different qualities or bandwidths.
- the support for 48 KHz stereo audio in downlink enables audio quality similar to or exceeding that of a uni-directional wireless audio connection such as the Advanced Audio Distribution Profile (A2DP).
- the AAC-ELD codec may be configured for dynamic bit rate switching to enable trade-offs between audio quality and robustness against packet loss.
- a method of streaming stereo audio signals over a Bluetooth link from a source device to a receiving device includes the source device initializing audio parameters to configure a stereo communication profile of the Bluetooth link.
- the audio parameters may include a configured sampling rate of a codec, a configured block size of encoded audio frames, and a configured audio quality level.
- the method also includes the source device receiving stereophonic audio signals that include audio signals for two channels.
- the method further includes the source device encoding the audio signals in the two channels based on the configured sampling rate into the encoded audio frames of the configured block size for each of the two channels.
- the method further includes the source device processing the encoded audio frames of the two channels into transport packets based on the configured audio quality level.
- the method further includes the source device transmitting the transport packets over the Bluetooth link to the receiving device.
- FIG. 1 depicts a scenario of a user wearing a wireless headset that is communicatively coupled to a smartphone over a Bluetooth audio link for the user to engage in a voice call according to one aspect of the disclosure.
- FIG. 2 A depicts monophonic audio content encoded in packetized audio frames that are transmitted in Bluetooth transport packets using mono Hands-Free Profile (HFP) so that the same audio frames are heard by both the left and the right buds of a wireless headset.
- FIG. 2 B depicts stereo audio content encoded in packetized audio frames for two channels and transmitted in Bluetooth transport packets using stereo HFP so that the audio frames for the two channels are separately heard by the left bud and the right bud of a wireless headset according to one aspect of the disclosure.
- FIG. 3 is a block diagram of processing modules of a source device configured to encode and packetize stereo audio signals into transport packets for downlink transmission to a wireless headset, and to disassemble and decode uplink transport packets received from the wireless headset using stereo HFP according to one aspect of the disclosure.
- FIG. 4 is a call flow diagram of interactions between hardware and software components of a source device configured to encode and packetize stereo audio signals into transport packets for transmission over the Bluetooth audio link using stereo HFP according to one aspect of the disclosure.
- FIG. 5 shows the possible stereo AAC-ELD configurations for encoding and transporting ELD packets of audio frames using the 360-byte 2-EV5 transport packets with a packet duration of 15 ms in stereo HFP according to one aspect of the disclosure.
- FIG. 6 shows the bundling of the constituent ELD packets in 2-EV5 transport packets when transitioning from high quality to medium quality for 24 KHz stereo AAC-ELD configuration according to one aspect of the disclosure.
- FIG. 7 is a flow diagram of a method for streaming stereo audio signals over a Bluetooth link from a source device to a receiving device using stereo HFP according to one aspect of the disclosure.
- When communicating audio content such as speech signals over a Bluetooth link, it is desirable to enhance the audio quality by enabling bi-directional audio signal transmission in stereo format carried on two or more channels.
- a source device such as a smartphone
- a sink device such as a wireless headset over the Bluetooth link
- bi-directional voice communication has been limited to a single audio channel due to processing latency considerations.
- many applications may benefit from bi-directional stereo voice communication.
- telephony or videoconferencing applications may encode speech signals received from multiple microphones for transport on the left and right channels to enable the rendering of spatial audio.
- the applications may also stream music in stereo format during a conversation session.
- binaural stereo signals may be transmitted to a wireless headset to provide a more immersive listening experience.
- the maximum data throughput may be tripled from 64 kbps of the mono HFP protocol to 192 kbps using a stereo HFP protocol.
- the increased throughput not only supports stereo operations, enabling A2DP-like audio quality (48 KHz) for downlink, but also allows the transport of redundant or FEC packets for increased robustness against packet loss.
- the AAC-ELD codec may be configured for dynamic bit rate switching to flexibly perform trade-offs between audio quality and robustness against packet loss.
- spatially relative terms such as “beneath”, “below”, “lower”, “above”, “upper”, and the like may be used herein for ease of description to describe one element's or feature's relationship to another element(s) or feature(s) as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as “below” or “beneath” other elements or features would then be oriented “above” the other elements or features. Thus, the exemplary term “below” can encompass both an orientation of above and below. The device may be otherwise oriented (e.g., rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
- FIG. 1 depicts a scenario of a user wearing a wireless headset 113 that is communicatively coupled to a smartphone 111 over a Bluetooth audio link 115 for the user to engage in a voice call according to one aspect of the disclosure.
- the smartphone 111 may run a telephony or videoconferencing application to allow the user to engage in a conversation with a remote user.
- the smartphone 111 may stream media content such as music to the user over the Bluetooth audio link 115 to allow the user to listen to the music through the wireless headset 113 while participating in a conversation.
- the Bluetooth audio link 115 may be configured in a communication protocol such as stereo HFP used for conducting hands-free calls or the Headset Profile (HSP) to allow the smartphone 111 and the headset 113 to exchange audio content in stereo format using two audio channels as will be discussed.
- the smartphone 111 may receive downlink audio signals representing the speech of the remote user in the telephony or videoconferencing call from a remote device (not shown) via a network (e.g., the Internet).
- the remote device may capture the downlink audio signals in more than one audio channel using an array of microphones.
- the downlink audio signals may be mixed with streaming stereo music generated locally by the smartphone 111 or received through the network.
- the smartphone 111 may encode the downlink audio signals or the mixed audio content using the AAC-ELD codec to generate audio frames.
- the smartphone 111 may packetize the audio frames into ELD packets and bundle the ELD packets into 2-EV5 transport packets for downlink transmission over the Bluetooth audio link 115 .
- the ELD packets may contain the encoded audio data in two channels, one for the left ear and one for the right ear of the wireless headset 113 .
- the wireless headset 113 may receive the 2-EV5 transport packets of the downlink transmission, disassemble the 2-EV5 transport packets into their constituent ELD packets, unpack the ELD packets of the two channels into audio frames, and decode the audio frames using an AAC-ELD decoder into the left and right channels for the left ear and the right ear, respectively, to enable the rendering of spatial audio to the user.
- the wireless headset 113 may contain an array of microphones to capture the near-field speech signals of the user in more than one channel.
- the wireless headset 113 may encode the near-field speech signals using the AAC-ELD codec to generate audio frames, packetize the audio frames into ELD packets, and bundle the ELD packets into 2-EV5 transport packets for uplink transmission over the Bluetooth audio link 115 .
- the ELD packets may contain the encoded near-field speech signals in two channels.
- the smartphone 111 may receive the 2-EV5 transport packets of the uplink transmission, disassemble the 2-EV5 transport packets into their constituent ELD packets, and unpack the ELD packets of the two channels into audio frames for the two channels.
- the smartphone 111 may further process the audio frames for transmission to the remote device through the network.
- FIG. 2 A depicts monophonic audio content encoded in packetized audio frames that are transmitted in Bluetooth transport packets using mono Hands-Free Profile (HFP) so that the same audio frames are heard by both the left and the right buds of a wireless headset.
- a source device such as the smartphone 111 may use modified sub-band codec (mSBC) with a sampling rate of 16 KHz or AAC-ELD codec with a higher sampling rate of 24 KHz to encode a mono stream of audio data.
- the encoded audio frames may have a duration of 7.5 ms and a block size of 180 samples.
- the source device may packetize the audio frames into ELD packets and assemble the ELD packets into transport packets such as the Bluetooth Classic Enhanced Synchronous Connection-Oriented (eSCO) 2-EV3 packets for transmission to a sink device.
- eSCO Bluetooth Classic Enhanced Synchronous Connection-Oriented
- the 2-EV3 packet encapsulating one ELD packet of encoded mono audio may have a maximum packet size of 60 bytes and a packet duration or duty cycle of 7.5 ms.
- the mono HFP protocol thus may support two 2-EV3 packets or 120 bytes every 15 ms, yielding a maximum bit rate of 64 kbps.
- a sink device such as the wireless headset 113 may process the 2-EV3 packets to decode audio data of the mono stream.
- the decoded audio data of the mono stream is streamed to both the left and the right earphones or earbuds of the wireless headset 113 .
- the 24 KHz sampling rate of the AAC-ELD codec and the maximum 64 kbps throughput of the mono HFP may support 12 KHz wide-band (WB) mono audio in both uplink and downlink directions.
- WB wide-band
- FIG. 2 B depicts stereo audio content encoded in packetized audio frames for two channels and transmitted in Bluetooth transport packets using stereo HFP so that the audio frames for the two channels are separately heard by the left bud and the right bud of a wireless headset according to one aspect of the disclosure.
- the Bluetooth transport packets use the larger eSCO 2-EV5 packets having a maximum packet size of 360 bytes and a packet duration or duty cycle of 15 ms to enable transmissions of the two channels of the stereo audio simultaneously.
- a source device may configure the AAC-ELD codec in the stereo HFP to use the same 24 KHz sampling rate and to generate encoded audio frames having the same 180-sample block size as the mono HFP.
- the source device may encode the two streams of the stereo audio into separate audio frames.
- the encoded audio frames for each channel may have a frame duration of 7.5 ms and a block size of 180 samples.
- the source device may packetize the encoded audio frames for the two channels into ELD packets every 7.5 ms.
- the source device may bundle two ELD packets within a 15 ms interval into a 2-EV5 transport packet for transmission to a sink device, yielding a bit rate of 128 kbps.
- a sink device such as the wireless headset 113 may process the 2-EV5 packets to decode audio data of the two channels.
- the decoded audio data for each channel is provided to either the left or the right earbud of the wireless headset 113 to render stereo audio.
- the 128 kbps throughput of the stereo HFP doubles the throughput of the mono HFP to support WB stereo audio in both uplink and downlink directions.
- the source device may configure the AAC-ELD codec to use a 32 KHz or 48 KHz sampling rate to generate encoded audio frames for each channel with a block size of 480 samples and a frame duration of 15 ms or 10 ms, respectively.
- the source device may packetize the encoded audio frames for the two channels into ELD packets for bundling into the 2-EV5 transport packets.
- the source device may bundle the two ELD packets, each packetized from the 180-sample encoded audio frames of each channel for a current 15 ms interval, with one of the two ELD packets transmitted in a previous 15 ms interval to generate a 2-EV5 transport packet of 360 bytes.
- the ELD packet from the previous 15 ms is considered a redundant or FEC packet that may be used by the sink device to recover the ELD packet if it was not received in the previous 15 ms.
- the redundant or FEC packet packetizes the encoded audio frame for the two channels to allow recovery of audio data for both channels.
- the recovery rate from single packet loss may be 50%.
- the source device may perform a trade-off between audio quality and robustness against packet loss to configure the stereo HFP to enable a 100% recovery rate from single packet loss at a cost of a reduction in audio quality.
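The bundling of current and redundant ELD packets described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation. The helper name `bundle_ev5` is hypothetical, and the 120-byte ELD packet size is derived from dividing the 360-byte payload among three packets.

```python
EV5_MAX_PAYLOAD = 360  # bytes per 2-EV5 transport packet (from the text)


def bundle_ev5(current, redundant=()):
    """Concatenate current and redundant ELD packets into one payload."""
    payload = b"".join(current) + b"".join(redundant)
    if len(payload) > EV5_MAX_PAYLOAD:
        raise ValueError("ELD packets exceed the 2-EV5 payload size")
    return payload


# Hypothetical 120-byte ELD packets for two consecutive 15 ms intervals.
previous = [bytes([1]) * 120, bytes([2]) * 120]
current = [bytes([3]) * 120, bytes([4]) * 120]

# Two current packets plus one packet repeated from the previous interval.
payload = bundle_ev5(current, redundant=previous[:1])
print(len(payload))      # 360
# Only the 240 bytes of current audio count toward the stereo bit rate.
print(240 * 8 / 15)      # 128.0
```

Because only one of the two previous packets fits alongside two 120-byte current packets, a lost transport packet can be only half recovered. Carrying both previous packets would require smaller ELD packets, which is the quality trade-off the text describes.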
- FIG. 3 is a block diagram of processing modules of a source device 301 configured to encode and packetize stereo audio signals into transport packets for downlink transmission to a Bluetooth headphone 309, and to disassemble and decode uplink transport packets received from the Bluetooth headphone 309 using stereo HFP according to one aspect of the disclosure.
- the source device 301 may be the smartphone 111 that runs an Internet Protocol (IP) or Voice over IP (VoIP) telephony or videoconferencing application to allow a user wearing the Bluetooth headphone 309 to engage in conversation with a remote user.
- An audio processing module 303 may receive audio signals in two channels representing the speech of the remote user, also referred to as the far-field speech signals, from a remote device via a network (e.g., the Internet).
- the audio signals may be encoded (e.g., MP3, AAC, etc.) and encapsulated in IP packets.
- the audio processing module 303 may disassemble the IP packets and decode the audio signal.
- the decoded two-channel far-field speech signals may be mixed with streaming music generated locally by the source device 301 or received through the network.
- the audio processing module 303 may output the decoded two-channel far-field speech signals or the mixed signals as downlink stereo audio signals 302.
- a Bluetooth stereo HFP processing module 305 may encode the downlink stereo audio signals 302 using the AAC-ELD codec to generate audio frames of downlink encoded stereo audio signals 304 for each channel.
- AAC-ELD codecs are chosen for their good audio quality and low processing latency.
- the source device 301 may configure the AAC-ELD codec to have a sampling rate of 24, 32, or 48 KHz for sampling 12-KHz WB, 16-KHz SWB, or 24-KHz FB audio, respectively.
- the source device 301 may also configure the AAC-ELD codec to generate audio frames of a selected block size and frame duration for each of the two channels as a function of the bandwidth of the audio signals.
- the block size may be selected to match the block size supported by the AAC-ELD decoder of the Bluetooth headphone 309 .
- the audio frames for each channel may have a block size of 180 samples with a frame duration of 7.5 ms when sampling WB audio at 24 KHz; 480 samples with a frame duration of 15 ms when sampling SWB audio at 32 KHz; 480 samples with a frame duration of 10 ms when sampling FB audio at 48 KHz, which would require packet fragmentation when bundling ELD packets into Bluetooth transport packets; or 360 samples with a frame duration of 7.5 ms when sampling the FB audio at 48 KHz with no packet fragmentation, etc.
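The frame durations listed above follow from dividing the block size by the sampling rate. The sketch below checks them and flags which configuration cannot fit a whole number of frames into one 15 ms transport interval. The function names are illustrative, and treating "a non-integer number of frames per interval" as the cause of fragmentation is an assumption drawn from the text.

```python
from fractions import Fraction

TRANSPORT_INTERVAL_MS = 15  # 2-EV5 packet duration from the text


def frame_duration_ms(block_size: int, sampling_rate_hz: int) -> Fraction:
    """Duration of one audio frame, kept exact with Fraction."""
    return Fraction(block_size * 1000, sampling_rate_hz)


def needs_fragmentation(block_size: int, sampling_rate_hz: int) -> bool:
    """True when a whole number of frames does not fit in one interval."""
    frames = TRANSPORT_INTERVAL_MS / frame_duration_ms(block_size, sampling_rate_hz)
    return frames.denominator != 1


for rate, block in [(24000, 180), (32000, 480), (48000, 480), (48000, 360)]:
    duration = float(frame_duration_ms(block, rate))
    print(rate, block, duration, needs_fragmentation(block, rate))
```

Of the four configurations, only the 480-sample block at 48 KHz (10 ms frames, 1.5 frames per interval) is flagged.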
- a Bluetooth audio link processing module 307 may packetize downlink encoded stereo audio signals 304 for the two channels into ELD packets.
- the ELD packets may be configured to have a packet size as a function of the desired audio quality and the bandwidth of the audio signals.
- the Bluetooth audio link processing module 307 may bundle or concatenate the ELD packets into the Bluetooth 2-EV5 transport packets having a maximum packet size of 360 bytes and a packet duration of 15 ms.
- the Bluetooth audio link processing module 307 may bundle into the 2-EV5 transport packets redundant or FEC packets used for packet loss recovery.
- the source device 301 may transmit the 2-EV5 transport packets containing packetized audio frames of the two channels of the downlink audio as the downlink component of the Bluetooth audio link packets 306 over the Bluetooth audio link to the Bluetooth headphone 309.
- the Bluetooth headphone 309 may receive the downlink 2-EV5 transport packets, process the 2-EV5 transport packets to recover the ELD packets, unpack the ELD packets into audio frames for the two channels, and decode the audio frames using an AAC-ELD decoder into separate channels carrying stereo audio signals for the left ear and the right ear of the user. In one aspect, if redundant or FEC packets are available, the Bluetooth headphone 309 may process the redundant or FEC packets to recover an ELD packet that was lost from a previous downlink 2-EV5 transport packet due to interference or degraded channel condition.
- the Bluetooth headphone 309 may capture the near-field speech signals of the user in two channels. Similar to the source device 301, the Bluetooth headphone 309 may encode the near-field speech signals using an AAC-ELD encoder to generate audio frames of encoded stereo audio signals for each channel as a function of the bandwidth of the audio signals. The Bluetooth headphone 309 may packetize the encoded stereo audio signals for the two channels into ELD packets and bundle or concatenate the ELD packets into the Bluetooth 2-EV5 transport packets as a function of the desired audio quality and the bandwidth of the audio signals.
- the Bluetooth headphone 309 may transmit the 2-EV5 transport packets containing packetized audio frames of the two channels of the uplink audio as the uplink component of the Bluetooth audio link packets 306 over the Bluetooth audio link to the source device 301.
- the downlink and uplink audio may be sampled at different sampling rates to generate encoded stereo signals of different audio qualities in the two directions.
- the downlink audio may be sampled at a higher sampling rate and allocated with more transmission bandwidth than the uplink audio to enable SWB or FB audio in the downlink only.
- the Bluetooth audio link processing module 307 of the source device 301 may process the uplink 2-EV5 transport packets to recover the ELD packets containing the packetized audio frames of the two channels of the uplink audio.
- the Bluetooth audio link processing module 307 may unpack the packetized audio frames to recover the audio frames of uplink encoded stereo audio signals 308 for each channel.
- the Bluetooth audio link processing module 307 may process the redundant or FEC packets to recover an ELD packet that was lost from a previous uplink 2-EV5 transport packet due to interference or degraded channel condition.
- the Bluetooth stereo HFP processing module 305 may decode the audio frames of the uplink encoded stereo audio signals 308 for each channel using the AAC-ELD codec to generate the uplink stereo audio signal 310 for the two channels representing the near-field speech signals of the user.
- the audio processing module 303 may process the uplink stereo audio signal 310 for uplink transmission to the remote device through the network.
- the audio processing module 303 may encode (e.g., MP3, AAC, etc.) the uplink stereo audio signal 310 and encapsulate the encoded audio in IP packets.
- the audio processing module 303 may encapsulate the audio frames of the uplink encoded stereo audio signals 308 for each channel into IP packets for uplink transmission without first decoding using the AAC-ELD codec.
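The packet loss recovery described above for both the headphone and the source device can be sketched from the receiver's side. This is an illustration under stated assumptions: the `TransportPacket` structure and the per-packet sequence numbers are hypothetical and are not defined in the disclosure, which does not say how a receiver matches a redundant packet to the one that was lost.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class TransportPacket:
    current: Dict[int, bytes]                  # sequence number -> ELD packet
    redundant: Dict[int, bytes] = field(default_factory=dict)


def reassemble(received: Iterable[TransportPacket]) -> Dict[int, bytes]:
    """Collect ELD packets, filling gaps from redundant copies."""
    eld: Dict[int, bytes] = {}
    for packet in received:
        for seq, data in packet.redundant.items():
            eld.setdefault(seq, data)   # only used if the original was lost
        eld.update(packet.current)
    return eld


p0 = TransportPacket(current={0: b"a", 1: b"b"})
p1 = TransportPacket(current={2: b"c", 3: b"d"}, redundant={0: b"a", 1: b"b"})
p2 = TransportPacket(current={4: b"e", 5: b"f"}, redundant={2: b"c", 3: b"d"})

# p1 is lost over the air; p2 carries copies of both of its ELD packets.
print(sorted(reassemble([p0, p2])))   # [0, 1, 2, 3, 4, 5]
```

If `p2` carried only one redundant packet, one of the two ELD packets from the lost interval would remain missing.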
- FIG. 4 is a call flow diagram of interactions between hardware and software components of a source device configured to encode and packetize stereo audio signals into transport packets for transmission over the Bluetooth audio link using stereo HFP according to one aspect of the disclosure.
- the source device may include the application layer software 401 , the Bluetooth driver 403 , and the Bluetooth modem 405 .
- a processor of the source device may execute the application layer software 401 to run a telephony or videoconferencing application.
- the application layer software 401 may implement the audio processing module 303 of FIG. 3 to allow a user to engage in a conversation with a remote user via a network.
- a processor may execute the Bluetooth driver 403 to configure the Bluetooth modem 405 to implement the stereo HFP protocol to exchange audio data with the Bluetooth headset 407.
- the Bluetooth driver 403 may implement the AAC-ELD codec of the Bluetooth stereo HFP processing module 305 of FIG. 3.
- the Bluetooth modem 405 may implement the packet processing of the ELD packets and the 2-EV5 transport packets of the Bluetooth audio link processing module 307 of FIG. 3.
- the Bluetooth modem 405 determines if it has the capability to support the stereo HFP protocol, such as the larger packet size and longer duty cycle of the 2-EV5 transport packet and the larger block size of the audio frames generated by the AAC-ELD codec to enable bi-directional stereo operation over a Bluetooth link.
- the Bluetooth modem 405 may initialize the profile information of the HFP to indicate that it supports stereo HFP.
- the Bluetooth headset 407 may transmit the capability information of its codec to the source device, such as whether it supports the AAC-ELD codec, the supported sampling rate of the AAC-ELD codec, the maximum block size of the decoder of the AAC-ELD codec, etc.
- the Bluetooth modem 405 may determine if the Bluetooth headset 407 supports stereo HFP and may publish capability information of the Bluetooth headset 407 and the profile information of the HFP supported by the Bluetooth modem 405 to the Bluetooth driver 403 .
- the Bluetooth driver 403 updates protocol information of the stereo HFP supported by the Bluetooth link between the source device and the Bluetooth headset 407 such as the supported audio parameters of the AAC-ELD codec based on the information received from the Bluetooth modem 405 .
- the application layer software 401 runs a telephony or videoconferencing application to establish a connection with a remote device to exchange audio data with the remote device through the network.
- the application layer software 401 may determine the desired performance parameters such as the desired audio quality level and the desired bandwidth of the audio data.
- the application layer software 401 may transmit a start signal with the desired performance parameters to the Bluetooth driver 403 to enable the stereo HFP.
- the Bluetooth driver 403 configures the AAC-ELD codec based on the desired performance parameters.
- the Bluetooth driver 403 may configure the AAC-ELD codec with a sampling rate of 24, 32, or 48 KHz to sample 12-KHz WB, 16-KHz SWB, or 24-KHz FB audio, respectively.
- the Bluetooth driver 403 may configure the AAC-ELD codec to generate audio frames of a selected block size and frame duration based on the desired bandwidth.
- the Bluetooth driver 403 may transmit stereo HFP configuration information to the Bluetooth modem 405 based on the desired audio quality level and the desired bandwidth.
- the Bluetooth modem 405 may configure the processing of the ELD packets and the 2-EV5 transport packets based on the stereo HFP configuration information received from the Bluetooth driver 403 .
- the Bluetooth modem 405 may configure the packet size of the ELD packets based on the desired audio quality level and the desired bandwidth.
- the Bluetooth modem 405 may configure the 2-EV5 transport packets to include redundant or FEC packets based on a trade-off between the desired audio quality and robustness against packet loss.
- the application layer software 401 receives encoded stereo audio signals representing the far-field speech of the remote user in the telephony or videoconferencing application.
- the application layer software 401 may decode the far-field speech signals to stream downlink stereo signals to the Bluetooth driver 403 .
- the Bluetooth driver 403 encodes the downlink stereo audio signals using the configured AAC-ELD codec to generate audio frames of downlink encoded stereo audio signals for each of the two channels.
- the Bluetooth driver 403 may transmit the audio frames to the Bluetooth modem 405 .
- the Bluetooth modem 405 packetizes the audio frames of downlink encoded stereo audio signals for the two channels into ELD packets based on its configuration from operation 421 .
- the Bluetooth modem 405 bundles or concatenates the ELD packets into the Bluetooth 2-EV5 transport packets.
- the Bluetooth modem 405 may bundle into the 2-EV5 transport packets redundant or FEC packets used for packet loss recovery.
- the Bluetooth modem 405 may transmit the 2-EV5 transport packets containing the packetized audio frames of the downlink encoded stereo audio signals over the Bluetooth link to the Bluetooth headset 407.
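The capability exchange and codec configuration steps of the call flow above might be sketched as a simple selection function. Every name here (`SinkCapabilities`, `select_codec_config`, `CANDIDATES`) is hypothetical. The candidate sampling rates and block sizes come from the text, while the order in which the two FB candidates are tried is an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class SinkCapabilities:
    supports_aac_eld: bool
    sampling_rates_khz: FrozenSet[int]
    max_block_size: int


# Candidate (sampling rate in KHz, block size) pairs per audio bandwidth,
# taken from the text. Trying 480 before 360 for FB is an assumption.
CANDIDATES = {
    "WB": [(24, 180)],
    "SWB": [(32, 480)],
    "FB": [(48, 480), (48, 360)],
}


def select_codec_config(caps: SinkCapabilities,
                        bandwidth: str) -> Optional[Tuple[int, int]]:
    """Pick the first candidate the sink's decoder can handle, if any."""
    if not caps.supports_aac_eld:
        return None
    for rate, block in CANDIDATES[bandwidth]:
        if rate in caps.sampling_rates_khz and block <= caps.max_block_size:
            return rate, block
    return None


headset = SinkCapabilities(True, frozenset({24, 32, 48}), max_block_size=360)
print(select_codec_config(headset, "FB"))    # (48, 360)
print(select_codec_config(headset, "SWB"))   # None
```

A real driver would presumably fall back to a narrower bandwidth rather than return nothing; that policy is left out because the text does not specify it.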
- FIG. 5 shows the possible stereo AAC-ELD configurations for encoding and transporting ELD packets of audio frames using the 360-byte 2-EV5 transport packets with a packet duration of 15 ms in stereo HFP according to one aspect of the disclosure.
- the AAC-ELD codec may be configured to generate the two channels of stereo audio signals using the sampling rate of 24 KHz.
- the 24 KHz stereo AAC-ELD configuration may generate audio frames of a nominal block size.
- the nominal block size may have 180 samples and a frame duration of 7.5 ms for each channel.
- the audio frames of the two channels may be packetized into ELD packets of different sizes as a function of the desired audio quality.
- two ELD packets within a current 15 ms interval of the 2-EV5 transport packet may be bundled with two ELD packets from a previous 15 ms interval to constitute the 360 bytes of the 2-EV5 transport packet every 15 ms.
- the two ELD packets of the previous 15 ms interval are considered redundant or forward error correction (FEC) packets that may be used by the decoder to recover up to 100% of single packet loss.
- the maximum data rate may be 96 kbps.
- two ELD packets within a current 15 ms interval of the 2-EV5 transport packet may be bundled with one of the two ELD packets from a previous 15 ms interval to constitute the 360 bytes of the 2-EV5 transport packet every 15 ms.
- the maximum data rate may be 128 kbps.
- the 120 bytes of the redundant or FEC packet from the previous 15 ms interval may allow recovery of up to 50% of single packet loss.
- the high quality stereo WB audio achieves higher audio quality but at a cost of reduced robustness against packet loss when compared to the medium quality stereo WB audio.
- the AAC-ELD codecs for uplink and downlink may be configured to run at the same 24-KHz sampling rate.
- the AAC-ELD packet transport may be configured for dynamic bit rate switching between the high quality and medium quality levels under the 24 KHz stereo AAC-ELD configuration to flexibly perform trade-offs between audio quality and robustness against packet loss. For example, while running high quality WB audio in stereo HFP, the 24 KHz stereo AAC-ELD configuration may be switched to medium quality WB audio when there is excessive packet loss due to interference or degraded channel condition of the Bluetooth link.
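One way the dynamic switching above could be driven is by a measured packet loss rate. The sketch below is purely illustrative. The 5% threshold, the measurement window, and the rule for returning to high quality are all assumptions, since the text only says that a switch to medium quality may occur when packet loss is excessive.

```python
def next_quality(current: str, lost: int, total: int,
                 loss_threshold: float = 0.05) -> str:
    """Return the quality level to use for the next measurement window."""
    if total <= 0:
        return current                  # nothing measured yet
    loss_rate = lost / total
    if current == "high" and loss_rate > loss_threshold:
        return "medium"                 # trade quality for redundancy
    if current == "medium" and loss_rate == 0:
        return "high"                   # assumed recovery rule
    return current


print(next_quality("high", lost=8, total=100))     # medium
print(next_quality("high", lost=2, total=100))     # high
print(next_quality("medium", lost=0, total=100))   # high
```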
- the audio frame may be expanded to support the higher sampling rate and the resulting bigger block size needed for the wider audio bandwidth of the 16 KHz SWB or the 24 KHz FB audio.
- the AAC-ELD codec may be configured with a sampling rate of 32 KHz to generate stereo audio samples of an expanded block size.
- the expanded block size may have 480 samples and a frame duration of 15 ms for each channel.
- the audio frames of the two channels may be packetized into ELD packets of different sizes as a function of the desired audio quality.
- one ELD packet for a current 15 ms interval may be bundled with one ELD packet from a previous 15 ms interval to constitute the 2-EV5 transport packet of 360 bytes every 15 ms.
- the maximum data rate may be 96 kbps.
- the ELD packet of the previous 15 ms interval is the redundant or FEC packet that may be used by the decoder to recover up to 100% of single packet loss.
- one ELD packet for a current 15 ms interval may be bundled with a smaller ELD packet of a previous 15 ms interval to constitute the 360-byte 2-EV5 transport packet every 15 ms.
- the maximum data rate may be 128 kbps.
- the smaller packet from the previous 15 ms interval may allow recovery of up to 100% of single packet loss, albeit at a reduced audio quality due to the smaller ELD packet size.
- the high quality stereo SWB audio achieves higher audio quality but at a cost of reduced robustness against packet loss when compared to the medium quality stereo SWB audio.
- the downlink and uplink may be configured to run at different audio quality levels of the 32 KHz stereo AAC-ELD configuration.
- the 32 KHz stereo AAC-ELD configuration may be enabled for downlink direction only.
- the stereo HFP may configure the AAC-ELD codec for the uplink to run at a lower sampling rate, such as using the 24 KHz stereo AAC-ELD configuration.
- the AAC-ELD packet transport may be configured for dynamic bit rate switching between the high quality and medium quality levels under the 32 KHz stereo AAC-ELD configuration to flexibly perform trade-offs between audio quality and robustness against packet loss.
- the AAC-ELD codec may be configured with a sampling rate of 48 KHz to generate stereo audio frames of the expanded block size.
- the audio frames of the two channels may be packetized into ELD packets.
- Three ELD packets may be fragmented into two 2-EV5 transport packets.
- the maximum data rate may be 192 kbps. Because the ELD packets fully occupy the 360 bytes of each 2-EV5 transport packet, no redundant or FEC packets are available for use to recover from packet loss.
- the AAC-ELD decoder of a sink device supports a maximum block size of 360 samples
- packet fragmentation associated with the 48 KHz stereo AAC-ELD configuration may be eliminated.
- the 48 KHz stereo AAC-ELD configuration may configure the AAC-ELD codec with a sampling rate of 48 KHz to generate audio frames of an intermediate block size.
- the intermediate block size may have 360 samples and a frame duration of 7.5 ms for each channel.
- the audio frames of the two channels may be packetized into ELD packets.
- Two ELD packets may be bundled or concatenated to constitute the 2-EV5 transport packet of 360 bytes every 15 ms, yielding a maximum data rate of 192 kbps. Again, because the new ELD packets fully occupy the 360 bytes of each 2-EV5 transport packet, no redundant or FEC packets are available for use to recover from packet loss.
- the downlink and uplink may be configured to run at different audio quality levels of the 48 KHz stereo AAC-ELD configuration.
- the 48 KHz stereo AAC-ELD configuration may be enabled for downlink direction only.
- the stereo HFP may configure the AAC-ELD codec for the uplink to run at a lower sampling rate, such as using the 32 KHz or 24 KHz stereo AAC-ELD configuration.
- the support for 48 KHz stereo AAC-ELD configuration in downlink enables audio quality similar to or exceeding that of uni-directional wireless audio connection such as the Advanced Audio Distribution Profile (A2DP).
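The data rates quoted for the stereo AAC-ELD configurations determine how much of each 360-byte 2-EV5 payload is left for redundant packets. The sketch below derives that budget; the function names are illustrative, and the byte counts are computed from the stated rates rather than quoted from the disclosure (except the 120-byte redundant packet, which the text states directly).

```python
EV5_PAYLOAD_BYTES = 360   # 2-EV5 payload size from the text
INTERVAL_MS = 15          # 2-EV5 packet duration from the text


def current_audio_bytes(rate_kbps: int) -> int:
    """Bytes of new audio per 15 ms interval at the given data rate."""
    return rate_kbps * INTERVAL_MS // 8


def redundancy_budget(rate_kbps: int) -> int:
    """Bytes left in the payload for redundant or FEC packets."""
    return EV5_PAYLOAD_BYTES - current_audio_bytes(rate_kbps)


for label, rate in [("medium", 96), ("high", 128), ("48 KHz", 192)]:
    print(label, current_audio_bytes(rate), redundancy_budget(rate))
```

At 96 kbps half the payload is free for redundancy, at 128 kbps one third is, and at 192 kbps none is, which matches the recovery behavior described for each configuration.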
- FIG. 6 shows the bundling of the constituent ELD packets in 2-EV5 transport packets when transitioning from high quality to medium quality for 24 KHz stereo AAC-ELD configuration according to one aspect of the disclosure.
- the AAC-ELD packet transport may be configured to switch between the high quality and medium quality levels under the 24 KHz stereo AAC-ELD configuration to flexibly perform trade-offs between audio quality and robustness against packet loss.
- the AAC-ELD packet transport is initially configured to encode stereo WB at the high quality level.
- the 360 bytes of each 2-EV5 transport packet include two ELD packets from a current 15 ms interval, the ELD packets generated based on the 24 KHz stereo AAC-ELD configuration at the high quality level, and one of two ELD packets from a previous 15 ms interval that is the redundant packet.
- the redundant packet may be a duplicate of the ELD packet transmitted in the previous 15 ms interval.
- a sink device may use the redundant packet to recover from a packet loss with a 50% recovery rate.
- the stereo HFP may configure the AAC-ELD packet transport to switch from the high quality level to the medium quality level.
- the 360 bytes of the 2-EV5 transport packet may include two ELD packets in the 15 ms of the transition interval, the two ELD packets generated based on the 24 KHz stereo AAC-ELD configuration at the medium quality level, and one of two ELD packets from the previous 15 ms interval that is the redundant packet, the redundant ELD packet generated based on the 24 KHz stereo AAC-ELD configuration at the high quality level.
- the remaining excess payload of the 2-EV5 transport packet may include a padding packet.
- the 360 bytes of the 2-EV5 transport packet may include two ELD packets from a current 15 ms interval, the two ELD packets generated based on the 24 KHz stereo AAC-ELD configuration at the medium quality level, and two ELD packets from a previous 15 ms interval that are the redundant packets.
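The payload layout across the high-to-medium transition can be worked through numerically. This sketch makes two assumptions that are not stated in the disclosure: the per-packet sizes (120 bytes at high quality, 90 bytes at medium quality) are derived from the 128 kbps and 96 kbps data rates, and the number of redundant packets is taken to be "as many previous packets as fit, up to two", a rule that happens to reproduce the three cases described.

```python
EV5_PAYLOAD_BYTES = 360
# Derived per-packet sizes (assumption): 128 kbps -> 120 B, 96 kbps -> 90 B.
ELD_BYTES = {"high": 120, "medium": 90}


def layout(current_quality: str, previous_quality: str):
    """Return (current, redundant, padding) byte counts for one interval."""
    current = 2 * ELD_BYTES[current_quality]
    # Redundant packets are copies from the previous interval, so they
    # keep the size of the previous interval's quality level.
    per_redundant = ELD_BYTES[previous_quality]
    count = min(2, (EV5_PAYLOAD_BYTES - current) // per_redundant)
    redundant = count * per_redundant
    padding = EV5_PAYLOAD_BYTES - current - redundant
    return current, redundant, padding


print(layout("high", "high"))       # (240, 120, 0)
print(layout("medium", "high"))     # (180, 120, 60)
print(layout("medium", "medium"))   # (180, 180, 0)
```

Under these assumed sizes, the transition interval is the only one that needs padding (60 bytes), because a second 120-byte high quality redundant packet would overflow the payload.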
- FIG. 7 is a flow diagram of a method 700 for streaming stereo audio signals over a Bluetooth link from a source device to a receiving device using stereo HFP according to one aspect of the disclosure.
- the method 700 may be practiced by one or more of the modules, modem, software, and driver of the source device of FIGS. 1, 2, 3 and 4.
- the stereo HFP initializes audio parameters for a stereo configuration profile for a source device of a Bluetooth link.
- the audio parameters include a configured sampling rate of a codec, a configured block size of encoded audio frames, and a configured audio quality level.
- the source device receives stereophonic audio signals.
- the stereophonic signals may include audio signals for two channels.
- the source device encodes the audio signals in each of the two channels based on the configured sampling rate into encoded audio frames of the configured block size for each of the two channels.
- the source device processes the encoded audio frames of the two channels into Bluetooth transport packets based on the configured audio quality level.
- the source device transmits the Bluetooth transport packets over the Bluetooth link to a receiving device.
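The operations of method 700 can be lined up as a small pipeline. This is a sketch only: `StereoProfile`, `stub_encode`, and `stream` are hypothetical names, the encoder is a placeholder that emits zero bytes instead of performing AAC-ELD encoding, and the ELD packet size stands in for the configured audio quality level.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class StereoProfile:
    sampling_rate_hz: int    # configured codec sampling rate
    block_size: int          # samples per encoded frame, per channel
    eld_packet_bytes: int    # stands in for the configured quality level


def stub_encode(frame: Sequence[int], size: int) -> bytes:
    """Placeholder for an AAC-ELD encoder: emits `size` zero bytes."""
    return bytes(size)


def stream(left: Sequence[int], right: Sequence[int], profile: StereoProfile,
           send: Callable[[bytes], None], frames_per_packet: int = 2) -> int:
    """Encode, packetize, and send stereo audio; returns packets sent."""
    step = profile.block_size * frames_per_packet
    usable = min(len(left), len(right))
    sent = 0
    for start in range(0, usable - step + 1, step):
        payload = b""
        for i in range(frames_per_packet):
            lo = start + i * profile.block_size
            hi = lo + profile.block_size
            half = profile.eld_packet_bytes // 2
            # One ELD packet carries the encoded frame of both channels.
            payload += stub_encode(left[lo:hi], half)
            payload += stub_encode(right[lo:hi], half)
        send(payload)
        sent += 1
    return sent


profile = StereoProfile(sampling_rate_hz=24000, block_size=180,
                        eld_packet_bytes=120)
outbox = []
count = stream([0] * 720, [0] * 720, profile, outbox.append)
print(count, [len(p) for p in outbox])   # 2 [240, 240]
```

Here 720 samples per channel at 24 KHz cover 30 ms, so two transport payloads are produced, each holding two ELD packets. Redundant packets are omitted to keep the sketch short.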
- aspects of the stereo HFP protocol described herein may be implemented in a data processing system, for example, by a network computer, network server, tablet computer, smartphone, laptop computer, desktop computer, other consumer electronic devices or other data processing systems.
- the operations described for the stereo HFP protocol are digital signal processing operations performed by a processor that is executing instructions stored in one or more memories.
- the processor may read the stored instructions from the memories and execute the instructions to perform the operations described.
- These memories represent examples of machine readable non-transitory storage media that can store or contain computer program instructions which when executed cause a data processing system to perform the one or more methods described herein.
- the processor may be a processor in a local device such as a smartphone, a processor in a remote server, or a distributed processing system of multiple processors in the local device and remote server with their respective memories containing various parts of the instructions needed to perform the operations described.
- any of the processing blocks may be re-ordered, combined or removed, performed in parallel or in serial, as necessary, to achieve the results set forth above.
- the processing blocks associated with implementing the audio processing system may be performed by one or more programmable processors executing one or more computer programs stored on a non-transitory computer readable storage medium to perform the functions of the system. All or part of the audio processing system may be implemented as special purpose logic circuitry (e.g., an FPGA (field-programmable gate array) and/or an ASIC (application-specific integrated circuit)).
- All or part of the audio system may be implemented using electronic hardware circuitry that includes electronic devices such as, for example, at least one of a processor, a memory, a programmable logic device or a logic gate. Further, processes can be implemented in any combination of hardware devices and software components.
- this speech or data may include personal information data that uniquely identifies or can be used to identify a specific person.
- personal information data can include demographic data, location-based data, online identifiers, telephone numbers, email addresses, home addresses, data or records relating to a user's health or level of fitness (e.g., vital signs measurements, medication information, exercise information), date of birth, or any other personal information.
- the present disclosure recognizes that such personal information data, in the present technology, can be used to the benefit of users.
- the present disclosure contemplates that those entities responsible for the collection, analysis, disclosure, transfer, storage, or other use of such personal information data will comply with well-established privacy policies and/or privacy practices.
- such entities would be expected to implement and consistently apply privacy practices that are generally recognized as meeting or exceeding industry or governmental requirements for maintaining the privacy of users.
- Such information regarding the use of personal data should be prominent and easily accessible by users, and should be updated as the collection and/or use of data changes.
- personal information from users should be collected for legitimate uses only. Further, such collection/sharing should occur only after receiving the consent of the users or other legitimate basis specified in applicable law. Additionally, such entities should consider taking any needed steps for safeguarding and securing access to such personal information data and ensuring that others with access to the personal information data adhere to their privacy policies and procedures.
- policies and practices should be adapted for the particular types of personal information data being collected and/or accessed and adapted to applicable laws and standards, including jurisdiction-specific considerations that may serve to impose a higher standard. For instance, in the US, collection of or access to certain health data may be governed by federal and/or state laws, such as the Health Insurance Portability and Accountability Act (HIPAA); whereas health data in other countries may be subject to other regulations and policies and should be handled accordingly.
- the present disclosure also contemplates aspects in which users selectively block the use of, or access to, personal information data. That is, the present disclosure contemplates that hardware and/or software elements can be provided to prevent or block access to such personal information data.
- personal information data should be managed and handled in a way to minimize risks of unintentional or unauthorized access or use. Risk can be minimized by limiting the collection of data and deleting data once it is no longer needed.
- data de-identification can be used to protect a user's privacy. De-identification may be facilitated, when appropriate, by removing identifiers, controlling the amount or specificity of data stored (e.g., collecting location data at city level rather than at an address level), controlling how data is stored (e.g., aggregating data across users), and/or other methods such as differential privacy.
- the present disclosure broadly covers the transmission or use of personal information data to implement one or more various disclosed aspects.
- the present disclosure also contemplates that the various aspects can also be implemented without the need for accessing such personal information data. That is, the various aspects of the present technology are not rendered inoperable due to the lack of all or a portion of such personal information data.
- content can be selected and delivered to users based on aggregated non-personal information data or a bare minimum amount of personal information, such as the content being handled only on the user's device or other non-personal information available to the content delivery services.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Reverberation, Karaoke And Other Acoustics (AREA)
- Mobile Radio Communication Systems (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/741,874 US12488803B2 (en) | 2021-06-04 | 2022-05-11 | Method and system for encoding and wirelessly transmitting stereo audio content for audio communication |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202163197001P | 2021-06-04 | 2021-06-04 | |
| US17/741,874 US12488803B2 (en) | 2021-06-04 | 2022-05-11 | Method and system for encoding and wirelessly transmitting stereo audio content for audio communication |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20220392460A1 (en) | 2022-12-08 |
| US12488803B2 (en) | 2025-12-02 |
Family
ID=84240683
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/741,874 Active 2043-02-18 US12488803B2 (en) | 2021-06-04 | 2022-05-11 | Method and system for encoding and wirelessly transmitting stereo audio content for audio communication |
Country Status (2)
| Country | Link |
|---|---|
| US (1) | US12488803B2 (en) |
| CN (1) | CN115442339B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12444407B1 (en) * | 2022-09-26 | 2025-10-14 | Amazon Technologies, Inc. | Privacy mode for multi-device processing |
Citations (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP1701484A1 (en) | 2005-03-08 | 2006-09-13 | Broadcom Corporation | Improved interoperability when content protection is used with an audio stream |
| US20080144645A1 (en) * | 2006-10-31 | 2008-06-19 | Motorola, Inc. | Methods and devices of a queue controller for dual mode bidirectional audio communication |
| US20080279162A1 (en) * | 2007-05-10 | 2008-11-13 | Broadcom Corporation, A California Corporation | Shared processing between wireless interface devices of a host device |
| US7765019B2 (en) | 2005-11-26 | 2010-07-27 | Wolfson Microelectronics Plc | Portable wireless telephony device |
| US20160164942A1 (en) | 2014-12-05 | 2016-06-09 | Facebook, Inc. | Decoupled audio and video codecs |
| US20170372708A1 (en) * | 2016-06-27 | 2017-12-28 | Qualcomm Incorporated | Audio decoding using intermediate sampling rate |
| US20190104423A1 (en) | 2017-09-29 | 2019-04-04 | Apple Inc. | Ultra-low latency audio over bluetooth |
| CN111225102A (en) | 2020-01-17 | 2020-06-02 | 北京塞宾科技有限公司 | Bluetooth audio signal transmission method and device |
| WO2020239985A1 (en) | 2019-05-31 | 2020-12-03 | Tap Sound System | Method for operating a bluetooth device |
| US20210067874A1 (en) * | 2019-08-26 | 2021-03-04 | Bestechnic (Shanghai) Co., Ltd. | Method, device, loudspeaker equipment and wireless headset for playing audio synchronously |
| US20210092579A1 (en) * | 2019-09-12 | 2021-03-25 | Catena Holding B.V. | Wireless peripheral |
| US20210343299A1 (en) * | 2019-01-13 | 2021-11-04 | Huawei Technologies Co., Ltd. | High resolution audio coding |
2022
- 2022-05-11: US application US17/741,874, patent US12488803B2 (en), status: Active
- 2022-06-02: CN application CN202210623789.0A, patent CN115442339B (en), status: Active
Patent Citations (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP1701484A1 (en) | 2005-03-08 | 2006-09-13 | Broadcom Corporation | Improved interoperability when content protection is used with an audio stream |
| US7765019B2 (en) | 2005-11-26 | 2010-07-27 | Wolfson Microelectronics Plc | Portable wireless telephony device |
| US20080144645A1 (en) * | 2006-10-31 | 2008-06-19 | Motorola, Inc. | Methods and devices of a queue controller for dual mode bidirectional audio communication |
| US20080279162A1 (en) * | 2007-05-10 | 2008-11-13 | Broadcom Corporation, A California Corporation | Shared processing between wireless interface devices of a host device |
| US20160164942A1 (en) | 2014-12-05 | 2016-06-09 | Facebook, Inc. | Decoupled audio and video codecs |
| US20170372708A1 (en) * | 2016-06-27 | 2017-12-28 | Qualcomm Incorporated | Audio decoding using intermediate sampling rate |
| US20190104423A1 (en) | 2017-09-29 | 2019-04-04 | Apple Inc. | Ultra-low latency audio over bluetooth |
| US20190104424A1 (en) * | 2017-09-29 | 2019-04-04 | Apple Inc. | Ultra-low latency audio over bluetooth |
| US20210343299A1 (en) * | 2019-01-13 | 2021-11-04 | Huawei Technologies Co., Ltd. | High resolution audio coding |
| WO2020239985A1 (en) | 2019-05-31 | 2020-12-03 | Tap Sound System | Method for operating a bluetooth device |
| US20210067874A1 (en) * | 2019-08-26 | 2021-03-04 | Bestechnic (Shanghai) Co., Ltd. | Method, device, loudspeaker equipment and wireless headset for playing audio synchronously |
| US20210092579A1 (en) * | 2019-09-12 | 2021-03-25 | Catena Holding B.V. | Wireless peripheral |
| CN111225102A (en) | 2020-01-17 | 2020-06-02 | 北京塞宾科技有限公司 | Bluetooth audio signal transmission method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| US20220392460A1 (en) | 2022-12-08 |
| CN115442339A (en) | 2022-12-06 |
| CN115442339B (en) | 2025-08-15 |
Similar Documents
| Publication | Title |
|---|---|
| JP7638329B2 (en) | Method for operating a bluetooth device |
| JP7426448B2 (en) | Control of connected multimedia devices |
| US12041523B2 (en) | Bluetooth audio streaming passthrough |
| US12010496B2 (en) | Method and system for performing audio ducking for headsets |
| US12387732B2 (en) | Inter-channel phase difference parameter encoding method and apparatus |
| WO2020152394A1 (en) | Audio representation and associated rendering |
| WO2021160040A1 (en) | Audio transmission method and electronic device |
| US12488803B2 (en) | Method and system for encoding and wirelessly transmitting stereo audio content for audio communication |
| WO2022012554A1 (en) | Multi-channel audio signal encoding method and apparatus |
| WO2022156556A1 (en) | Bit allocation method and apparatus for audio object |
| WO2021213128A1 (en) | Audio signal encoding method and apparatus |
| JP2025534455A (en) | Method, apparatus and medium for encoding and decoding audio bitstreams |
| WO2021255327A1 (en) | Managing network jitter for multiple audio streams |
| WO2022012628A1 (en) | Multi-channel audio signal encoding/decoding method and device |
| Ostan et al. | Dynamic real-time ambisonics order adaptation for immersive networked music performances |
| Rothbucher et al. | Backwards compatible 3d audio conference server using hrtf synthesis and sip |
| Ito et al. | A Study on Effect of IP Performance Degradation on Horizontal Sound Localization in a VoIP Phone Service with 3D Sound Effects |
| TW202315430A (en) | Bluetooth voice communication system and related computer program product for generating stereo voice effect |
| JP2025535723A (en) | Method, apparatus and medium for decoding audio signals using skippable blocks |
| JP2025535060A (en) | Method, apparatus and medium for encoding and decoding an audio bitstream and associated echo reference signal |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AS | Assignment |
Owner name: APPLE INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KUMAR, AARTI;ALLAMANCHE, ERIC A.;FU, SU;AND OTHERS;REEL/FRAME:059893/0318 Effective date: 20220510 Owner name: APPLE INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNOR'S INTEREST;ASSIGNORS:KUMAR, AARTI;ALLAMANCHE, ERIC A.;FU, SU;AND OTHERS;REEL/FRAME:059893/0318 Effective date: 20220510 |
|
| FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: ALLOWED -- NOTICE OF ALLOWANCE NOT YET MAILED Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
| STCF | Information on status: patent grant |
Free format text: PATENTED CASE |