US20180146319A1 - Audio Source Spatialization Relative to Orientation Sensor and Output - Google Patents
- Publication number
- US20180146319A1 (application US 15/355,766)
- Authority
- US
- United States
- Prior art keywords
- audio
- output
- sensor
- source
- orientation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
- H04S7/303—Tracking of listener position or orientation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/027—Spatial or constructional arrangements of microphones, e.g. in dummy heads
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2201/00—Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
- H04R2201/40—Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
- H04R2201/401—2D or 3D arrays of transducers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
- H04R2430/20—Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/033—Headphones for stereophonic communication
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
- H04S7/303—Tracking of listener position or orientation
- H04S7/304—For headphones
Definitions
- This invention relates to an audio processing system and more particularly to an audio processing system that spatializes audio for output.
- WO 2016/090342 A2 published Jun. 9, 2016, the disclosure of which is expressly incorporated herein and which was made by the inventor of subject matter described herein, shows an adaptive audio spatialization system having an audio sensor array rigidly mounted to a personal speaker.
- a personal speaker(s) such as headphones or earphones.
- Headphones are a pair of small speakers that are designed to be held in place close to a user's ears. They may be electroacoustic transducers which convert an electrical signal to a corresponding sound in the user's ear. Headphones are designed to allow a single user to listen to an audio source privately, in contrast to a loudspeaker which emits sound into the open air, allowing anyone nearby to listen. Earbuds or earphones are in-ear versions of headphones.
- The sensitive transducer element of a microphone is called its element or capsule. Except in thermophone-based microphones, sound is first converted to mechanical motion by means of a diaphragm, the motion of which is then converted to an electrical signal.
- a complete microphone also includes a housing, some means of bringing the signal from the element to other equipment, and often an electronic circuit to adapt the output of the capsule to the equipment being driven.
- a wireless microphone contains a radio transmitter.
- The MEMS (Micro-Electro-Mechanical Systems) microphone is also called a microphone chip or silicon microphone.
- A pressure-sensitive diaphragm is etched directly into a silicon wafer by MEMS processing techniques, and is usually accompanied by an integrated preamplifier.
- MEMS microphones are variants of the condenser microphone design.
- Digital MEMS microphones have built in analog-to-digital converter (ADC) circuits on the same CMOS chip making the chip a digital microphone and so more readily integrated with modern digital products.
- ADC analog-to-digital converter
- Major manufacturers producing MEMS silicon microphones are Wolfson Microelectronics (WM7xxx), Analog Devices, Akustica (AKU200x), Infineon (SMM310 product), Knowles Electronics, Memstech (MSMx), NXP Semiconductors, Sonion MEMS, Vesper, AAC Acoustic Technologies, and Omron.
- a microphone's directionality or polar pattern indicates how sensitive it is to sounds arriving at different angles about its central axis.
- the polar pattern represents the locus of points that produce the same signal level output in the microphone if a given sound pressure level (SPL) is generated from that point.
- SPL sound pressure level
- How the physical body of the microphone is oriented relative to the diagrams depends on the microphone design. Large-membrane microphones are often known as “side fire” or “side address” on the basis of the sideward orientation of their directionality. Small diaphragm microphones are commonly known as “end fire” or “top/end address” on the basis of the orientation of their directionality.
- Some microphone designs combine several principles in creating the desired polar pattern. This ranges from shielding (meaning diffraction/dissipation/absorption) by the housing itself to electronically combining dual membranes.
- An omni-directional (or non-directional) microphone's response is generally considered to be a perfect sphere in three dimensions. In the real world, this is not the case.
- the polar pattern for an “omni-directional” microphone is a function of frequency.
- the body of the microphone is not infinitely small and, as a consequence, it tends to get in its own way with respect to sounds arriving from the rear, causing a slight flattening of the polar response. This flattening increases as the diameter of the microphone (assuming it's cylindrical) reaches the wavelength of the frequency in question.
- A unidirectional microphone is sensitive to sounds from only one direction.
- a noise-canceling microphone is a highly directional design intended for noisy environments.
- One such use is in aircraft cockpits where they are normally installed as boom microphones on headsets.
- Another use is in live event support on loud concert stages for vocalists involved with live performances.
- Many noise-canceling microphones combine signals received from two diaphragms that are in opposite electrical polarity or are processed electronically.
- the main diaphragm is mounted closest to the intended source and the second is positioned farther away from the source so that it can pick up environmental sounds to be subtracted from the main diaphragm's signal. After the two signals have been combined, sounds other than the intended source are greatly reduced, substantially increasing intelligibility.
- Other noise-canceling designs use one diaphragm that is affected by ports open to the sides and rear of the microphone.
- Sensitivity indicates how well the microphone converts acoustic pressure to output voltage.
- A high-sensitivity microphone creates more voltage and so needs less amplification at the mixer or recording device. This is a practical concern but not a direct indication of the microphone's quality. In fact, the term sensitivity is something of a misnomer; “transduction gain” (or just “output level”) is perhaps more meaningful, because true sensitivity is generally set by the noise floor, and too much “sensitivity” in terms of output level compromises the clipping level.
- a microphone array is any number of microphones operating in tandem. Microphone arrays may be used in systems for extracting voice input from ambient noise (notably telephones, speech recognition systems, and hearing aids), surround sound and related technologies, binaural recording, locating objects by sound: acoustic source localization, e.g., military use to locate the source(s) of artillery fire, aircraft location and tracking.
- an array is made up of omni-directional microphones, directional microphones, or a mix of omni-directional and directional microphones distributed about the perimeter of a space, linked to a computer that records and interprets the results into a coherent form.
- Arrays may also have one or more microphones in an interior area encompassed by the perimeter.
- Arrays may also be formed using numbers of very closely spaced microphones. Given a fixed physical relationship in space between the different individual microphone transducer array elements, simultaneous DSP (digital signal processor) processing of the signals from each of the individual microphone array elements can create one or more “virtual” microphones.
- Beamforming or spatial filtering is a signal processing technique used in sensor arrays for directional signal transmission or reception. This is achieved by combining elements in a phased array in such a way that signals at particular angles experience constructive interference while others experience destructive interference.
- a phased array is an array of antennas, microphones, or other sensors in which the relative phases of respective signals are set in such a way that the effective radiation pattern is reinforced in a desired direction and suppressed in undesired directions.
- the phase relationship may be adjusted for beam steering.
- Beamforming can be used at both the transmitting and receiving ends in order to achieve spatial selectivity.
- the improvement compared with omni-directional reception/transmission is known as the receive/transmit gain (or loss).
- Adaptive beamforming is used to detect and estimate a signal-of-interest at the output of a sensor array by means of optimal (e.g., least-squares) spatial filtering and interference rejection.
- a beamformer controls the phase and relative amplitude of the signal at each transmitter, in order to create a pattern of constructive and destructive interference in the wavefront.
- information from different sensors is combined in a way where the expected pattern of radiation is preferentially observed.
- A narrow band system, typical of radars or wide microphone arrays, is one where the bandwidth is only a small fraction of the center frequency. With wide band systems, which are typical in sonars, this approximation no longer holds.
- the signal from each sensor may be amplified by a different “weight.”
- Different weighting patterns (e.g., Dolph-Chebyshev) can be used to achieve the desired sensitivity patterns.
- a main lobe is produced together with nulls and side lobes.
- the position of a null can be controlled. This is useful to ignore noise or jammers in one particular direction, while listening for events in other directions. A similar result can be obtained on transmission.
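The effect of per-sensor weights on the main lobe, nulls, and side lobes can be illustrated numerically. The following is a minimal sketch, assuming a uniform linear array of eight sensors at half-wavelength spacing; the function name `array_pattern` is hypothetical, and a Hamming taper is used here as a simple stand-in for a Dolph-Chebyshev weighting.

```python
import numpy as np

def array_pattern(weights, spacing_wavelengths, angles_rad):
    """Magnitude response of a uniform linear array to far-field plane waves.

    weights: one amplitude weight per sensor
    spacing_wavelengths: sensor spacing divided by the wavelength
    angles_rad: arrival angles measured from broadside
    """
    weights = np.asarray(weights, dtype=complex)
    n = np.arange(weights.size)
    # Phase of each arriving plane wave at each sensor (rows: angles, columns: sensors).
    phase = 2j * np.pi * spacing_wavelengths * np.outer(np.sin(angles_rad), n)
    return np.abs(np.exp(phase) @ weights)

angles = np.radians(np.linspace(-90.0, 90.0, 361))
uniform = np.ones(8) / 8.0                      # equal weights
tapered = np.hamming(8) / np.hamming(8).sum()   # tapered weights (stand-in for Dolph-Chebyshev)

pattern_uniform = array_pattern(uniform, 0.5, angles)
pattern_tapered = array_pattern(tapered, 0.5, angles)

# Far from the main lobe, the tapered weights leave smaller side lobes.
far = np.abs(np.sin(angles)) >= 0.75
print(pattern_uniform[far].max(), pattern_tapered[far].max())
```

With equal weights the pattern has a null where the sine of the arrival angle is 0.25; changing the weights moves the nulls and trades main-lobe width against side-lobe height.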
- Beamforming techniques can be broadly divided into two categories:
- an adaptive beamformer is able to automatically adapt its response to different situations. Some criterion has to be set up to allow the adaption to proceed such as minimizing the total noise output. Because of the variation of noise with frequency, in wide band systems it may be desirable to carry out the process in the frequency domain.
- Beamforming can be computationally intensive.
- Beamforming can be used to try to extract sound sources in a room, such as multiple speakers in the cocktail party problem. This requires the locations of the speakers to be known in advance, for example by using the time of arrival from the sources to mics in the array, and inferring the locations from the distances.
- beamforming systems include an array of spatially distributed sensor elements, such as antennas, sonar phones or microphones, and a data processing system for combining signals detected by the array.
- the data processor combines the signals to enhance the reception of signals from sources located at select locations relative to the sensor elements.
- the data processor “aims” the sensor array in the direction of the signal source.
- a linear microphone array uses two or more microphones to pick up the voice of a talker. Because one microphone is closer to the talker than the other microphone, there is a slight time delay between the two microphones.
- the data processor adds a time delay to the nearest microphone to coordinate these two microphones. By compensating for this time delay, the beamforming system enhances the reception of signals from the direction of the talker, and essentially aims the microphones at the talker.
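The two-microphone case can be sketched as a delay-and-sum operation. This is an illustrative sketch only, assuming the delay is already known and is a whole number of samples; `delay_and_sum` is a hypothetical helper name, and white noise stands in for the talker's voice.

```python
import numpy as np

def delay_and_sum(near, far, delay_samples):
    """Delay the nearer microphone's signal by a whole number of samples, then average."""
    delayed = np.concatenate([np.zeros(delay_samples), near])[: len(near)]
    return 0.5 * (delayed + far)

rng = np.random.default_rng(0)
voice = rng.standard_normal(1000)            # stand-in for the talker's voice
d = 5                                        # extra travel time to the far microphone, in samples
near = voice
far = np.concatenate([np.zeros(d), voice])[:1000]

aligned = delay_and_sum(near, far, d)        # compensated: the two copies add coherently
misaligned = 0.5 * (near + far)              # uncompensated: the two copies partly cancel
print(np.mean(aligned ** 2), np.mean(misaligned ** 2))
```

The compensated sum keeps the full power of the voice, while the uncompensated sum of the same two signals has roughly half the power in this white-noise example.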
- a beamforming apparatus may connect to an array of sensors, e.g. microphones that can detect signals generated from a signal source, such as the voice of a talker.
- the sensors can be spatially distributed in a linear, a two-dimensional array or a three-dimensional array, with a uniform or non-uniform spacing between sensors.
- A linear array is useful for an application where the sensor array is mounted on a wall or a podium; the talker is then free to move about a half-plane with an edge defined by the location of the array.
- Each sensor detects the voice audio signals of the talker and generates electrical response signals that represent these audio signals.
- An adaptive beamforming apparatus provides a signal processor that can dynamically determine the relative time delay between each of the audio signals detected by the sensors.
- a signal processor may include a phase alignment element that uses the time delays to align the frequency components of the audio signals.
- The signal processor has a summation element that adds together the aligned audio signals to increase the quality of the desired audio source while simultaneously attenuating sources having different delays relative to the sensor array. Because the relative time delays for a signal relate to the position of the signal source relative to the sensor array, the beamforming apparatus provides, in one aspect, a system that “aims” the sensor array at the talker to enhance the reception of signals generated at the location of the talker and to diminish the energy of signals generated at locations different from that of the desired talker's location. The practical application of a linear array is limited to situations which are either in a half plane or where knowledge of the direction to the source is not critical.
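One common way to determine the relative time delay dynamically is to locate the peak of the cross-correlation between two sensor signals. The sketch below assumes that approach (the text does not say which estimator is used), equal-length signals, and whole-sample delays; `estimate_delay` and `align_and_sum` are hypothetical names.

```python
import numpy as np

def estimate_delay(reference, other):
    """Lag, in samples, by which `other` trails `reference` (equal-length inputs)."""
    corr = np.correlate(other, reference, mode="full")
    return int(np.argmax(corr)) - (len(reference) - 1)

def align_and_sum(signals):
    """Align every signal to the first one, then average them."""
    reference = signals[0]
    aligned = []
    for s in signals:
        lag = estimate_delay(reference, s)
        # np.roll wraps samples around the ends; acceptable for a short sketch.
        aligned.append(np.roll(s, -lag))
    return np.mean(aligned, axis=0)

rng = np.random.default_rng(1)
talker = rng.standard_normal(500)
delayed = np.concatenate([np.zeros(7), talker[:-7]])   # same voice, 7 samples later
lag = estimate_delay(talker, delayed)
combined = align_and_sum([talker, delayed])
```

Sources arriving with a different delay pattern are not aligned by this step, so they add incoherently and are attenuated relative to the talker.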
- a third sensor that is not co-linear with the first two sensors is sufficient to define a planar direction, also known as azimuth.
- Three sensors do not provide sufficient information to determine elevation of a signal source.
- At least a fourth sensor, not co-planar with the first three sensors is required to obtain sufficient information to determine a location in a three dimensional space.
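The planar case can be made concrete with a short calculation. The following sketch assumes a far-field source (plane wave), a speed of sound of 343 m/s, and known sensor coordinates; `azimuth_from_delays` is a hypothetical name. It recovers azimuth from the two arrival-time differences that three non-collinear sensors provide.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def azimuth_from_delays(positions, delays):
    """Far-field azimuth (radians) from three non-collinear sensors in a plane.

    positions: three (x, y) sensor coordinates in metres
    delays: arrival times at sensors 1 and 2 minus the arrival time at sensor 0
    """
    positions = np.asarray(positions, dtype=float)
    baselines = positions[1:] - positions[0]
    # A sensor displaced toward the source hears the wave earlier:
    # delay_i = -(baseline_i . u) / c, with u the unit vector pointing at the source.
    u = np.linalg.solve(baselines, -SPEED_OF_SOUND * np.asarray(delays, dtype=float))
    return np.arctan2(u[1], u[0])

mics = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
true_azimuth = np.radians(40.0)
u_true = np.array([np.cos(true_azimuth), np.sin(true_azimuth)])
delays = [-np.dot(np.subtract(m, mics[0]), u_true) / SPEED_OF_SOUND for m in mics[1:]]
estimate = azimuth_from_delays(mics, delays)
print(np.degrees(estimate))
```

If the three sensors were collinear, the two baselines would be parallel and the 2×2 system would be singular, which is the algebraic form of the requirement that the third sensor not lie on the line of the first two.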
- A change in the position and orientation of the sensor array can produce the aforementioned dramatic effects even if the talker is not moving, because movement of the array changes the relative position and orientation of the talker.
- Knowledge of any change in the location and orientation of the array can compensate for the increase in computational resources and decrease in effectiveness of the location determination and sound isolation.
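How a known orientation change can be used may be sketched as follows. This is an illustration under simplifying assumptions: only rotation about the vertical axis (yaw) is considered, the source is stationary, and the function names are hypothetical. Given the array's yaw from an orientation sensor, the direction to a previously located source can be re-expressed in the array's new frame without searching for it again.

```python
import numpy as np

def yaw_rotation(yaw_rad):
    """Rotation matrix for a turn of yaw_rad about the vertical axis."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    return np.array([[c, -s], [s, c]])

def azimuth_in_array_frame(world_azimuth_rad, array_yaw_rad):
    """Azimuth of a fixed source as seen by an array that has turned by array_yaw_rad."""
    world_dir = np.array([np.cos(world_azimuth_rad), np.sin(world_azimuth_rad)])
    # Express the world direction in the rotated array's own axes.
    local_dir = yaw_rotation(array_yaw_rad).T @ world_dir
    return np.arctan2(local_dir[1], local_dir[0])

# A source at 30 degrees; the array then turns 30 degrees toward it.
after_turn = azimuth_in_array_frame(np.radians(30.0), np.radians(30.0))
# The same source after the array turns 15 degrees away from it.
after_turn_away = azimuth_in_array_frame(np.radians(30.0), np.radians(-15.0))
print(np.degrees(after_turn), np.degrees(after_turn_away))
```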
- U.S. Pat. No. 7,415,117 shows audio source location identification and isolation.
- Known systems rely on stationary microphone arrays.
- a position sensor is any device that permits position measurement. It can either be an absolute position sensor or a relative one.
- Position sensors can be linear, angular, or multi-axis. Examples of position sensors include: capacitive transducer, capacitive displacement sensor, eddy-current sensor, ultrasonic sensor, grating sensor, Hall effect sensor, inductive non-contact position sensors, laser Doppler vibrometer (optical), linear variable differential transformer (LVDT), multi-axis displacement transducer, photodiode array, piezo-electric transducer (piezo-electric), potentiometer, proximity sensor (optical), rotary encoder (angular), seismic displacement pick-up, and string potentiometer (also known as string encoder or cable position transducer). Inertial position sensors are common in modern electronic devices.
- a gyroscope is a device used for measurement of angular velocity. Gyroscopes are available that can measure rotational velocity in 1, 2, or 3 directions. 3-axis gyroscopes are often implemented with a 3-axis accelerometer to provide a full 6 degree-of-freedom (DoF) motion tracking system.
- A gyroscopic sensor is a type of inertial position sensor that senses the rate of rotation and may indicate roll, pitch, and yaw.
- An accelerometer is another common inertial position sensor.
- An accelerometer may measure proper acceleration, which is the acceleration it experiences relative to freefall and is the acceleration felt by people and objects. Accelerometers are available that can measure acceleration in one, two, or three orthogonal axes. The acceleration measurement has a variety of uses.
- the sensor can be implemented in a system that detects velocity, position, shock, vibration, or the acceleration of gravity to determine orientation.
- An accelerometer having two orthogonal sensors is capable of sensing pitch and roll. This is useful in capturing head movements.
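Pitch and roll can be read from the direction of gravity when the sensor is at rest. The sketch below assumes a three-axis reading in units of g, with the z axis reading +1 g when the device is level; axis and sign conventions differ between devices, so these formulas are one common convention rather than a fixed rule, and `pitch_roll_from_gravity` is a hypothetical name.

```python
import math

def pitch_roll_from_gravity(ax, ay, az):
    """Pitch and roll (radians) from a static accelerometer reading in units of g.

    Assumes the sensor is at rest, so the only acceleration sensed is gravity,
    and that the z axis reads +1 g when the device is level.
    """
    pitch = math.atan2(-ax, math.hypot(ay, az))
    roll = math.atan2(ay, az)
    return pitch, roll

level = pitch_roll_from_gravity(0.0, 0.0, 1.0)
rolled = pitch_roll_from_gravity(0.0, math.sin(math.radians(30)), math.cos(math.radians(30)))
pitched = pitch_roll_from_gravity(-math.sin(math.radians(20)), 0.0, math.cos(math.radians(20)))
```

Gravity alone carries no information about rotation around the vertical axis, which is why yaw needs a gyroscope or magnetometer in addition.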
- a third orthogonal sensor may be added to obtain orientation in three dimensional space. This is appropriate for the detection of pen angles, etc.
- the sensing capabilities of an inertial position sensor can detect changes in six degrees of spatial measurement freedom by the addition of three orthogonal gyroscopes to a three axis accelerometer.
- Magnetometers are devices that measure the strength and/or direction of a magnetic field. Because magnetic fields are defined by containing both a strength and direction (vector fields), magnetometers that measure just the strength or direction are called scalar magnetometers, while those that measure both are called vector magnetometers. Today, both scalar and vector magnetometers are commonly found in consumer electronics, such as tablets and cellular devices. In most cases, magnetometers are used to obtain directional information in three dimensions by being paired with accelerometers and gyroscopes. The combined device is called an inertial measurement unit (IMU) or a 9-axis position sensor.
- IMU inertial measurement unit
- a head-related transfer function is a response that characterizes how an ear receives a sound from a point in space; a pair of HRTFs for two ears can be used to synthesize a binaural sound that seems to come from a particular point in space. It is a transfer function, describing how a sound from a specific point will arrive at the ear (generally at the outer end of the auditory canal).
- Some consumer home entertainment products designed to reproduce surround sound from stereo (two-speaker) headphones use HRTFs.
- Some forms of HRTF-processing have also been included in computer software to simulate surround sound playback from loudspeakers.
- the monaural cues come from the interaction between the sound source and the human anatomy, in which the original source sound is modified before it enters the ear canal for processing by the auditory system. These modifications encode the source location, and may be captured via an impulse response which relates the source location and the ear location. This impulse response is termed the head-related impulse response (HRIR). Convolution of an arbitrary source sound with the HRIR converts the sound to that which would have been heard by the listener if it had been played at the source location, with the listener's ear at the receiver location. HRIRs have been used to produce virtual surround sound.
- HRIR head-related impulse response
- the HRTF is the Fourier transform of HRIR.
- the HRTF is also sometimes known as the anatomical transfer function (ATF).
- HRTFs for left and right ear describe the filtering of a sound source (x(t)) before it is perceived at the left and right ears as xL(t) and xR(t), respectively.
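The filtering of x(t) into xL(t) and xR(t) can be sketched as two convolutions with the left and right HRIRs. The impulse responses below are toy values chosen for illustration, not measured HRIRs, and `spatialize` is a hypothetical name.

```python
import numpy as np

def spatialize(x, hrir_left, hrir_right):
    """Convolve a mono source with a left/right HRIR pair to get the two ear signals."""
    return np.convolve(x, hrir_left), np.convolve(x, hrir_right)

# Toy impulse responses, NOT measured HRIRs: the right ear hears the source
# three samples later and quieter, as if the source sat to the listener's left.
hrir_left = np.array([1.0, 0.0, 0.0, 0.0])
hrir_right = np.array([0.0, 0.0, 0.0, 0.6])

x = np.sin(2 * np.pi * 440 * np.arange(480) / 48000.0)   # 440 Hz tone at 48 kHz
x_left, x_right = spatialize(x, hrir_left, hrir_right)
```

With measured HRIRs in place of the toy arrays, the same two convolutions produce the binaural pair for the direction at which those HRIRs were measured.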
- the HRTF can also be described as the modifications to a sound from a direction in free air to the sound as it arrives at the eardrum. These modifications include the shape of the listener's outer ear, the shape of the listener's head and body, the acoustic characteristics of the space in which the sound is played, and so on. All these characteristics will influence how (or whether) a listener can accurately tell what direction a sound is coming from. The associated mechanism varies between individuals, as their head and ear shapes differ.
- HRTF describes how a given sound wave input (parameterized as frequency and source location) is filtered by the diffraction and reflection properties of the head, pinna, and torso, before the sound reaches the transduction machinery of the eardrum and inner ear (see auditory system).
- The source-location-specific pre-filtering effects of these external structures aid in the neural determination of source location, particularly the determination of the source's elevation (see vertical sound localization).
- FFTF free-field transfer function
- DTF directional transfer function
- HRTF H(f) is the Fourier transform of the HRIR h(t).
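For sampled data, the relation between h(t) and H(f) corresponds to a discrete Fourier transform. A minimal sketch, using a delayed unit impulse as a placeholder HRIR (an assumption for illustration, not measured data):

```python
import numpy as np

def hrtf_from_hrir(hrir, n_fft=None):
    """Discrete-frequency HRTF as the FFT of a sampled, real-valued HRIR."""
    return np.fft.rfft(hrir, n=n_fft)

# Placeholder HRIR: a unit impulse delayed by four samples.
hrir = np.zeros(64)
hrir[4] = 1.0

H = hrtf_from_hrir(hrir)
magnitude = np.abs(H)                 # flat for a pure delay
recovered = np.fft.irfft(H, n=64)     # the inverse transform returns the HRIR
```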
- HRTFs are complicated functions of frequency and the three spatial variables. For distances greater than 1 m from the head, however, the HRTF can be said to attenuate inversely with range. It is this far field HRTF, $H(f, \theta, \varphi)$, that has most often been measured. At closer range, the difference in level observed between the ears can grow quite large, even in the low-frequency region within which negligible level differences are observed in the far field.
- HRTFs are typically measured in an anechoic chamber to minimize the influence of early reflections and reverberation on the measured response. HRTFs are measured at small increments of $\theta$ such as 15° or 30° in the horizontal plane, with interpolation used to synthesize HRTFs for arbitrary positions of $\theta$. Even with small increments, however, interpolation can lead to front-back confusion, and optimizing the interpolation procedure is an active area of research.
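The simplest form of such interpolation is a linear blend between the two nearest measured angles. The sketch below assumes that scheme and applies it tap by tap to time-domain responses; it is a crude illustration rather than a recommended procedure, and `interpolate_hrir` and the toy responses are hypothetical.

```python
import numpy as np

def interpolate_hrir(measured_angles_deg, measured_hrirs, target_deg):
    """Linearly interpolate each HRIR tap between the neighbouring measured angles.

    measured_angles_deg: increasing measurement angles, e.g. [0, 30, 60]
    measured_hrirs: array of shape (number of angles, taps)
    """
    measured_hrirs = np.asarray(measured_hrirs, dtype=float)
    return np.array([
        np.interp(target_deg, measured_angles_deg, measured_hrirs[:, tap])
        for tap in range(measured_hrirs.shape[1])
    ])

angles = [0.0, 30.0]
hrirs = [[1.0, 0.0, 0.0],    # toy response measured at 0 degrees
         [0.0, 1.0, 0.0]]    # toy response measured at 30 degrees
halfway = interpolate_hrir(angles, hrirs, 15.0)
```

Blending two responses whose main peaks sit at different delays yields two smaller peaks rather than one peak at an intermediate delay, which is one reason more careful interpolation procedures are studied.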
- the head-related transfer function is involved in resolving the Cone of Confusion, a series of points where ITD and ILD are identical for sound sources from many locations around the “0” part of the cone.
- When a sound is received by the ear it can either go straight down the ear into the ear canal or it can be reflected off the pinnae of the ear, into the ear canal a fraction of a second later.
- the sound will contain many frequencies, so therefore many copies of this signal will go down the ear all at different times depending on their frequency (according to reflection, diffraction, and their interaction with high and low frequencies and the size of the structures of the ear.) These copies overlap each other, and during this, certain signals are enhanced (where the phases of the signals match) while other copies are canceled out (where the phases of the signal do not match). Essentially, the brain is looking for frequency notches in the signal that correspond to particular known directions of sound.
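The origin of such frequency notches can be illustrated with the simplest possible model: a direct sound plus a single delayed, attenuated reflection. The delay, gain, and sample rate below are illustrative assumptions, not values from the text, and `first_notch_hz` is a hypothetical name.

```python
import numpy as np

def first_notch_hz(delay_samples, reflection_gain, sample_rate, n_fft=480):
    """Lowest notch frequency produced by a direct path plus one delayed reflection."""
    h = np.zeros(delay_samples + 1)
    h[0] = 1.0                              # direct path
    h[delay_samples] += reflection_gain     # reflected copy arriving later
    magnitude = np.abs(np.fft.rfft(h, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # Look only below the second notch so the first one is unambiguous.
    band = freqs < sample_rate / delay_samples
    return freqs[band][np.argmin(magnitude[band])]

# A reflection six samples late at 48 kHz (0.125 ms) cancels the direct sound
# where the two copies are half a period apart.
notch = first_notch_hz(delay_samples=6, reflection_gain=0.8, sample_rate=48000)
print(notch)
```

In this model the first notch falls at one over twice the reflection delay, so a change in path length (and hence in source direction) moves the notch.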
- a basic assumption in the creation of a virtual auditory space is that if the acoustical waveforms present at a listener's eardrums are the same under headphones as in free field, then the listener's experience should also be the same.
- sounds generated from headphones appear to originate from within the head.
- the headphones should be able to “externalize” the sound.
- sounds can be spatially positioned using the technique described below.
- Let x1(t) represent an electrical signal driving a loudspeaker and y1(t) represent the signal received by a microphone inside the listener's eardrum.
- Similarly, let x2(t) represent the electrical signal driving a headphone and y2(t) represent the microphone response to the signal.
- L is the transfer function of the loudspeaker in the free field
- F is the HRTF
- M is the microphone transfer function
- H is the headphone-to-eardrum transfer function
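- The equations relating these quantities are not reproduced in the text above. The following is a sketch of the relationship commonly given for this arrangement, assuming that capital letters X and Y denote the Fourier transforms of the corresponding time signals and that each stage acts as a linear, time-invariant filter:

```latex
\begin{aligned}
Y_1 &= X_1 \, L \, F \, M && \text{(loudspeaker in the free field)} \\
Y_2 &= X_2 \, H \, M && \text{(headphone)} \\
Y_1 = Y_2 \;&\Rightarrow\; X_2 = X_1 \, \frac{L \, F}{H}
\end{aligned}
```

- Under these assumptions the microphone response M cancels, and the headphone drive signal is the loudspeaker signal filtered by LF/H.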
- a scaling factor is a function of the anthropometric features. For example, a training set of N subjects would consider each HRTF phase and describe a single ITD scaling factor as the average delay of the group. This computed scaling factor can estimate the time delay as a function of the direction and elevation for any given individual. Converting the time delay to phase response for the left and the right ears is trivial.
- the HRTF phase can be described by the ITD scaling factor. This in turn is quantified by the anthropometric data of a given individual taken as the source of reference. For a generic case we consider β as a sparse vector
- β = [β_1, β_2, …, β_N]^T
- ITD scaling factor value H′ is estimated as:
- All the HRTFs of the training set are stacked in a new tensor H ∈ ℝ^{N×D×K}, so the value H_{n,d,k} corresponds to the k-th frequency bin for the d-th HRTF direction of the n-th person. Also, H′_{d,k} corresponds to the k-th frequency for every d-th HRTF direction of the synthesized HRTF.
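- The estimation formula itself is not reproduced in the text above. As a hedged sketch only, one plausible reading is that the synthesized HRTF is a weighted sum of the training HRTFs, using one sparse weight per training subject. The function name `synthesize_hrtf` and the nested-list layout are hypothetical.

```python
def synthesize_hrtf(training_hrtfs, weights):
    """Weighted sum over subjects (assumed form, not confirmed by the source text).

    training_hrtfs[n][d][k]: value for subject n, direction d, frequency bin k.
    weights[n]: sparse weight for subject n.
    Returns result[d][k] = sum over n of weights[n] * training_hrtfs[n][d][k].
    """
    num_dirs = len(training_hrtfs[0])
    num_bins = len(training_hrtfs[0][0])
    return [[sum(w * subject[d][k] for w, subject in zip(weights, training_hrtfs))
             for k in range(num_bins)]
            for d in range(num_dirs)]
```

- With a sparse weight vector, most subjects contribute nothing, and the synthesized response is built from the few training subjects whose anthropometric features best match the listener.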
- Recordings processed via an HRTF, such as in a computer gaming environment (for example, with A3D, EAX and OpenAL), which approximates the HRTF of the listener, can be heard through stereo headphones or speakers and interpreted as if they comprise sounds coming from all directions, rather than just two points on either side of the head.
- the perceived accuracy of the result depends on how closely the HRTF data set matches the physiological structure of the listener's head/ears.
- An audio spatialization system is desirable for use in connection with a personal audio playback system such as headphones, earphones, and/or earbuds.
- the system is intended to operate so that a user can customize the audio information received through personal speakers.
- the system is capable of customizing the listening experience of a user and may include at least some portion of the ambient audio or artificially-generated position specific audio.
- the system may be provided so that the audio spatialization applied may maintain orientation with respect to a fixed frame of reference as the listener moves and tracks movement of an actual or apparent audio source even when the speakers and sensor are not maintained in the same relative position and orientation to the listener.
- the system may operate to identify and isolate audio emanating from a source located in a particular position.
- the isolated audio may be provided through an audio spatialization engine to a user's personal speakers maintaining the same orientation.
- the system is designed so that the apparent location of audio from a set of personal speakers can be configured to remain constant when a user and/or the sensors turn or move. For example, if the user turns to the right, the personal speakers will turn with the user.
- the system may apply a modification to the spatialization so that the apparent location of the audio source is moved relative to the user, i.e., to the user's left, and the user perceives the audio source as remaining stationary even while the user is moving relative to the source.
- This may be accomplished by motion sensors detecting changes in position or orientation of the user and modifying the audio spatialization in order to compensate for the change in location or orientation of the user, and in particular the ear speakers being used.
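- As a minimal, non-limiting sketch of this compensation, assume that only yaw (rotation about the vertical axis) is tracked and that angles are in degrees, increasing clockwise when viewed from above. The helper names `wrap_degrees` and `apparent_azimuth` are hypothetical.

```python
def wrap_degrees(angle_deg):
    """Wrap an angle to the range [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0

def apparent_azimuth(source_bearing_deg, head_yaw_deg):
    """Direction of a world-fixed source relative to where the head is facing.

    Negative results are to the listener's left, positive to the right.
    """
    return wrap_degrees(source_bearing_deg - head_yaw_deg)

# A source straight ahead; the listener then turns 30 degrees to the right.
print(apparent_azimuth(0.0, 30.0))
```

- Rendering the source at the returned angle (30° to the left in this case) keeps it fixed in the room while the head turns.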
- the system may also use audio source tracking to detect movement of the audio source and to compensate so that the user will perceive the audio source motion.
- an augmented reality video game may be greatly enhanced by addition of directional audio.
- a game element may be assigned to a real world location.
- a player carrying a smart phone or personal communication device with a GPS or other position sensor may interact with game elements using application software on the personal communication device when in proximity to the game element.
- a position sensor in fixed orientation with the user's head may be used to control spatialization of audio coordinated with the location assigned to the game element.
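- One way such coordination could be sketched, assuming that the game element and the player are each described by a latitude/longitude pair and that the head-mounted sensor reports a compass heading in degrees, is to compute the bearing to the game element and subtract the head heading. The function names and the tuple layout are hypothetical.

```python
import math

def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Great-circle initial bearing from point 1 to point 2, clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(x, y)) % 360.0

def game_element_azimuth(player, element, head_heading_deg):
    """Direction of the game element relative to the facing direction of the head.

    `player` and `element` are (latitude, longitude) tuples in degrees.
    The result is wrapped to [-180, 180); negative values are to the left.
    """
    bearing = initial_bearing_deg(player[0], player[1], element[0], element[1])
    return (bearing - head_heading_deg + 180.0) % 360.0 - 180.0
```

- For a game element due east of the player, a player facing east would hear it straight ahead, while a player facing north would hear it 90° to the right.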
- a user may be listening to music in an office, in a restaurant, at a sporting event or in any other environment in which there are multiple people speaking in various directions relative to the user.
- the user may be utilizing one or more detached microphone arrays or other sensors in order to identify and, when desired, stream certain sounds or voices to the user.
- the user may wish to quickly turn in the direction relative to the user from where the desired sound is emanating or from where the speaker is standing in order to show recognition to the speaker that he/she is heard and to focus visually in the direction of such sound source.
- the user may be wearing headphones, earphones, a hearable or assisted listening device incorporating or connected to a directional sensor, along with an ability to accurately reproduce sounds with a directional element (a straightforward function if such direction is to the left or right of the user, or a more complex function utilizing a 3D technology or spatial engine such as Realsound3D from Visisonics if the sound is from the front, back, or a different elevation relative to the user).
- a position sensor in the external microphone array or sensor will synchronize with the position sensor of the user, thus enabling the user to hear the sounds in the user's ears as though the external sensor was being worn, even as it is detached from the user.
- An audio source signal may be connected to the audio spatialization system.
- the motion sensor associated with the personal speaker system may be connected to a listener position/orientation unit having an output connected to the audio spatialization engine representing position and orientation of the personal speaker system.
- the audio spatialization engine may add spatial characteristics to the output of the audio source on the basis of the output of the listener position/orientation unit and/or directional cues obtained from a directional cue reporting unit.
- An audio customization system may be provided to enhance a user's audio environment.
- An embodiment of the system may be implemented with a sensor (microphone) array that is not in a fixed location/direction relative to personal speakers.
- the audio spatialization system may include a personal speaker system with an input of an electrical signal which is converted to audio.
- An audio spatialization engine output is connected to the personal speaker system to apply a spatial or directional component to the audio being output by the personal speaker system.
- the directional cue reporting unit may include a location processor in turn connected to a beamforming unit, a beam steering unit and a directionally discriminating acoustic sensor associated with the personal speaker system.
- the directionally discriminating acoustic sensor may be a microphone array.
- the association between the directionally discriminating acoustic sensor and the personal speaker system is such that there is a fixed or a known relationship between the position or orientation of the personal speaker system and the directionally discriminating acoustic sensor.
- a motion sensor also is arranged in a fixed or known position and orientation with respect to the personal speaker system.
- the audio spatialization engine may apply head related transfer functions to the audio source.
- An audio spatialization system may include a personal speaker system with an input representative of an audio input and an audio spatialization engine having an output representative of the audio output of the personal speaker system.
- An audio source having an output may be connected to the audio spatialization engine.
- a motion sensor may be associated with the personal speaker system.
- a listener position orientation unit may have an input connected to the motion sensor and an output connected to the audio spatialization engine representing the position and orientation of the personal speaker system.
- the audio spatialization engine may add spatial characteristics to the output of the audio source on the basis of the output of the listener position/orientation unit.
- the audio spatialization system may include a directional cue reporting unit having an output representative of a direction connected to the audio spatialization engine.
- the audio spatialization engine may add spatial characteristics to the output of the audio source on the added basis of the output representative of a direction of the directional cue reporting unit.
- the directional cue reporting unit may include a location processor connected to a beamforming unit, a beam steering unit and a directionally discriminating acoustic sensor associated with the personal speaker system.
- the directionally discriminating acoustic sensor may be a microphone array.
- the motion sensor may be an accelerometer, a gyroscope, and/or a magnetometer.
- the audio spatialization engine may apply head related transfer functions to the output of the audio source.
- FIG. 1 shows a pair of headphones with an embodiment of a microphone array.
- FIG. 2 shows a portable microphone array.
- FIG. 3 shows a spatial audio processing system.
- FIG. 4 shows a spatial audio processing system which may be used with non-ambient source information.
- FIG. 1 shows a pair of headphones which may be used in the system.
- the headphones 101 may include a headband 102 .
- the headband 102 may form an arc which, when in use, sits over the user's head.
- the headphones 101 may also include ear speakers 103 and 104 connected to the headband 102 .
- the ear speakers 103 and 104 are colloquially referred to as “cans.”
- a position sensor 106 may be mounted in the headphones, for example, in an ear speaker housing 103 or in a headband 102 (not shown).
- the position sensor 106 may be a 9-axis position sensor.
- the position sensor 106 may include a magnetometer and/or an accelerometer.
- FIG. 2 shows a portable microphone array.
- the portable microphone array may be contained in a housing 200 .
- the configuration of the housing is not important to the operation.
- the housing may be a freestanding device.
- the housing 200 may be part of a personal communications device such as a cell phone or smart phone.
- the housing may be portable.
- the housing 200 may include a cover 201 .
- a plurality of microphones 202 may be arranged on the cover 201 .
- the plurality of microphones 202 may be positioned with any suitable geometric configuration.
- a linear arrangement is one possible geometric configuration.
- the plurality of microphones 202 may include three (3) or more non-co-linear microphones. Non-co-linear arrangement of three or more microphones is advantageous in that the microphone signals may be used by a beamformer for unambiguous determination of direction of arrival of point-generated audio.
- eight (8) microphones 202 may be provided which are equally spaced and define a circle.
- a central microphone 203 may also be provided to facilitate accurate determination of the source direction of arrival.
- the portable microphone array may also include a position sensor 204 .
- the position sensor may be a 9-axis position sensor.
- the position sensor 204 may include an absolute orientation sensor such as a magnetometer.
- FIG. 3 shows a spatial audio processing system.
- the spatial audio processing system of FIG. 3 may operate on the assumption that the microphone array 301 is located in close proximity to the speakers 307 and the point audio source is located in a position that is not between the microphone array 301 and speakers 307 .
- a microphone array 301 may provide a multi-channel signal representative of the audio information sensed by multiple microphones to an audio analysis and processing unit 303 .
- An array position sensor 302 is fixedly linked to a microphone array 301 and generates a signal indicative of the orientation of the microphone array 301 .
- the audio analysis and processing unit 303 operates to generate one or more signals representative of one or more audio beams of interest.
- An example of an audio analysis and processing unit is described in co-pending U.S. patent application Ser. No. ______, Attorney Docket No. 111031 entitled, “Audio Analysis and Processing System”, filed on even date herewith and expressly incorporated by reference herein.
- the audio analysis and processing unit may generate a signal corresponding to the audio beam direction which is connected to the position accumulator 305 .
- the audio analysis and processing unit may use a beamformer to select a beam which includes audio information of interest or may include beam-steering capabilities to refine the direction of arrival of audio from an audio source.
- the speaker position sensor 304 may be fixed to speakers 307 and may generate a signal indicative of the speaker position.
- the signal indicative of the speaker position may be an absolute orientation signal such as may be generated by a magnetometer.
- the speaker position sensor 304 may utilize gyroscopic and/or inertial sensors.
- the position accumulator 305 has inputs indicative of the microphone array orientation, the speaker orientation, and the beam direction. This information is combined in order to determine the proper apparent direction of arrival of the audio information relative to the speaker position.
- the speaker 307 may be a personal speaker in fixed orientation relative to the user, for example, headphones or earphones.
- a spatial processor 306 may be provided to impart spatialization to the signal representing the audio beam.
- the spatial processor 306 may have an output which is a binaural spatialized audio signal connected to the speaker 307 which may be binaural speakers.
- the spatial processor 306 may apply a head-related transfer function to the signal representing the audio beam and generate a binaural output according to the direction determined by the position accumulator 305 .
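- As a rough stand-in for the spatial processor, the following sketch produces a two-channel output from a mono beam signal and a direction. It does not apply a measured head-related transfer function; it assumes a crude spherical-head model in which the far ear is delayed and attenuated. The head radius, speed of sound, far-ear gain law, and the function name `spatialize` are all illustrative assumptions.

```python
import math

def spatialize(mono, azimuth_deg, sample_rate=48000,
               head_radius_m=0.0875, speed_of_sound=343.0):
    """Crude binaural rendering that delays and attenuates the far ear.

    Positive azimuth places the source to the listener's right.
    Azimuths beyond +/-90 degrees are clamped, so front and back are not
    distinguished by this simplified model.
    Returns (left, right) sample lists of equal length.
    """
    theta = math.radians(max(-90.0, min(90.0, azimuth_deg)))
    # Spherical-head approximation of the interaural time difference.
    itd_s = head_radius_m / speed_of_sound * (abs(theta) + math.sin(abs(theta)))
    delay = int(round(itd_s * sample_rate))
    far_gain = 1.0 - 0.5 * abs(math.sin(theta))    # arbitrary illustrative level difference
    near = list(mono) + [0.0] * delay
    far = [0.0] * delay + [far_gain * s for s in mono]
    return (far, near) if azimuth_deg > 0 else (near, far)
```

- In a fuller implementation the delay-and-gain step would be replaced by filtering with the left-ear and right-ear responses selected for the direction supplied by the position accumulator.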
- FIG. 4 shows a spatial audio processing system which may be used with non-ambient source information.
- the non-ambient source information may, for example, be used in augmented reality or virtual reality systems which are arranged to provide personal speakers with spatialized audio information.
- Elements in FIG. 4 which correlate to elements in FIG. 3 have been given the same reference numbers.
- An audio source system 401 may be a video game or other system which generates audio having a positional or directional frame of reference not fixed to the orientation of a personal speaker system 307 .
- the directional source information system includes a source position 402 output provided to a position accumulator 405 .
- the unit 401 also provides an audio output 403 which is intended to have an apparent direction of arrival indicated by source position 402 .
- a position accumulator 405 receives a signal indicative of the orientation of the speaker position sensor 304 , and a signal indicative of the intended orientation of direction of arrival of the source position 402 .
- the position accumulator 405 generates a signal indicative of the direction of arrival referenced to the orientation of the speakers 307 .
- the spatial processor 306 spatializes the directional source audio 403 in accordance with the output of the position accumulator 405 and has an output of a spatialized binaural signal having the proper orientation, connected to speakers 307 .
- a personal speaker system may be oriented in a north facing direction. If a microphone array is oriented in an east facing direction and the direction of arrival of an audio signal is 45° off of the facing direction of the microphone array, the position accumulator receives a signal representative of each orientation, namely 0° for north, 90° for east and 45° for the direction of arrival, for a total of 135° (90 − 0 + 45) for the orientation of the apparent audio source relative to the orientation of the speakers.
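- The arithmetic of this example can be sketched as a small function. Headings are assumed to be compass angles in degrees (0° for north, 90° for east), the beam direction is assumed to be measured in the same rotational sense from the facing direction of the microphone array, and the function name `accumulate_direction` is hypothetical.

```python
def accumulate_direction(array_heading_deg, speaker_heading_deg, beam_direction_deg):
    """Apparent direction of the source relative to the facing direction of the speakers.

    Combines the microphone array heading, the speaker heading, and the beam
    direction measured relative to the array, wrapping the result to [0, 360).
    """
    return (array_heading_deg - speaker_heading_deg + beam_direction_deg) % 360.0

# Array facing east (90), speakers facing north (0), beam 45 degrees off the array.
print(accumulate_direction(90.0, 0.0, 45.0))  # 135.0
```

- If the listener then turns to face east as well, the same call with a speaker heading of 90° yields 45°, so the source is rendered 45° off the new facing direction.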
- a motion detector, such as a gyroscope and/or a compass, may be provided in connection with a microphone array. Because the microphone array is configured to be carried by a person, and because people move, a motion detector may be used to ascertain change in position and/or orientation of the microphone array.
- the techniques, processes and apparatus described may be utilized to control operation of any device and conserve use of resources based on conditions detected or applicable to the device.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Circuit For Audible Band Transducer (AREA)
- Stereophonic System (AREA)
Description
- This invention relates to an audio processing system and more particularly to an audio processing system that spatializes audio for output.
- WO 2016/090342 A2, published Jun. 9, 2016, the disclosure of which is expressly incorporated herein and which was made by the inventor of subject matter described herein, shows an adaptive audio spatialization system having an audio sensor array rigidly mounted to a personal speaker.
- It is known to use microphone arrays and beamforming technology in order to locate and isolate an audio source. Personal audio is typically delivered to a user by a personal speaker(s) such as headphones or earphones. Headphones are a pair of small speakers that are designed to be held in place close to a user's ears. They may be electroacoustic transducers which convert an electrical signal to a corresponding sound in the user's ear. Headphones are designed to allow a single user to listen to an audio source privately, in contrast to a loudspeaker which emits sound into the open air, allowing anyone nearby to listen. Earbuds or earphones are in-ear versions of headphones.
- The sensitive transducer element of a microphone is called its element or capsule. Except in thermophone based microphones, sound is first converted to mechanical motion by a diaphragm, the motion of which is then converted to an electrical signal. A complete microphone also includes a housing, some means of bringing the signal from the element to other equipment, and often an electronic circuit to adapt the output of the capsule to the equipment being driven. A wireless microphone contains a radio transmitter.
- The MEMS (Micro-Electro-Mechanical System) microphone is also called a microphone chip or silicon microphone. A pressure-sensitive diaphragm is etched directly into a silicon wafer by MEMS processing techniques, and is usually accompanied by an integrated preamplifier. Most MEMS microphones are variants of the condenser microphone design. Digital MEMS microphones have built-in analog-to-digital converter (ADC) circuits on the same CMOS chip, making the chip a digital microphone and so more readily integrated with modern digital products. Major manufacturers producing MEMS silicon microphones are Wolfson Microelectronics (WM7xxx), Analog Devices, Akustica (AKU200x), Infineon (SMM310 product), Knowles Electronics, Memstech (MSMx), NXP Semiconductors, Sonion MEMS, Vesper, AAC Acoustic Technologies, and Omron.
- A microphone's directionality or polar pattern indicates how sensitive it is to sounds arriving at different angles about its central axis. The polar pattern represents the locus of points that produce the same signal level output in the microphone if a given sound pressure level (SPL) is generated from that point. How the physical body of the microphone is oriented relative to the diagrams depends on the microphone design. Large-membrane microphones are often known as “side fire” or “side address” on the basis of the sideward orientation of their directionality. Small diaphragm microphones are commonly known as “end fire” or “top/end address” on the basis of the orientation of their directionality.
- Some microphone designs combine several principles in creating the desired polar pattern. This ranges from shielding (meaning diffraction/dissipation/absorption) by the housing itself to electronically combining dual membranes.
- An omni-directional (or non-directional) microphone's response is generally considered to be a perfect sphere in three dimensions. In the real world, this is not the case. As with directional microphones, the polar pattern for an “omni-directional” microphone is a function of frequency. The body of the microphone is not infinitely small and, as a consequence, it tends to get in its own way with respect to sounds arriving from the rear, causing a slight flattening of the polar response. This flattening increases as the diameter of the microphone (assuming it's cylindrical) reaches the wavelength of the frequency in question.
- A unidirectional microphone is sensitive to sounds from only one direction.
- A noise-canceling microphone is a highly directional design intended for noisy environments. One such use is in aircraft cockpits where they are normally installed as boom microphones on headsets. Another use is in live event support on loud concert stages for vocalists involved with live performances. Many noise-canceling microphones combine signals received from two diaphragms that are in opposite electrical polarity or are processed electronically. In dual diaphragm designs, the main diaphragm is mounted closest to the intended source and the second is positioned farther away from the source so that it can pick up environmental sounds to be subtracted from the main diaphragm's signal. After the two signals have been combined, sounds other than the intended source are greatly reduced, substantially increasing intelligibility. Other noise-canceling designs use one diaphragm that is affected by ports open to the sides and rear of the microphone.
- Sensitivity indicates how well the microphone converts acoustic pressure to output voltage. A high sensitivity microphone creates more voltage and so needs less amplification at the mixer or recording device. This is a practical concern but is not directly an indication of the microphone's quality. In fact, the term sensitivity is something of a misnomer, “transduction gain” (or just “output level”) being perhaps more meaningful, because true sensitivity is generally set by the noise floor, and too much “sensitivity” in terms of output level compromises the clipping level.
- A microphone array is any number of microphones operating in tandem. Microphone arrays may be used in systems for extracting voice input from ambient noise (notably telephones, speech recognition systems, and hearing aids), surround sound and related technologies, binaural recording, and locating objects by sound (acoustic source localization), e.g., military use to locate the source(s) of artillery fire, and aircraft location and tracking.
- Typically, an array is made up of omni-directional microphones, directional microphones, or a mix of omni-directional and directional microphones distributed about the perimeter of a space, linked to a computer that records and interprets the results into a coherent form. Arrays may also have one or more microphones in an interior area encompassed by the perimeter. Arrays may also be formed using numbers of very closely spaced microphones. Given a fixed physical relationship in space between the different individual microphone transducer array elements, simultaneous DSP (digital signal processor) processing of the signals from each of the individual microphone array elements can create one or more “virtual” microphones.
- Beamforming or spatial filtering is a signal processing technique used in sensor arrays for directional signal transmission or reception. This is achieved by combining elements in a phased array in such a way that signals at particular angles experience constructive interference while others experience destructive interference. A phased array is an array of antennas, microphones, or other sensors in which the relative phases of respective signals are set in such a way that the effective radiation pattern is reinforced in a desired direction and suppressed in undesired directions. The phase relationship may be adjusted for beam steering. Beamforming can be used at both the transmitting and receiving ends in order to achieve spatial selectivity. The improvement compared with omni-directional reception/transmission is known as the receive/transmit gain (or loss).
- Adaptive beamforming is used to detect and estimate a signal-of-interest at the output of a sensor array by means of optimal (e.g., least-squares) spatial filtering and interference rejection.
- To change the directionality of the array when transmitting, a beamformer controls the phase and relative amplitude of the signal at each transmitter, in order to create a pattern of constructive and destructive interference in the wavefront. When receiving, information from different sensors is combined in a way where the expected pattern of radiation is preferentially observed.
- With narrow-band systems the time delay is equivalent to a “phase shift”, so in the case of a sensor array, each sensor output is shifted a slightly different amount. This is called a phased array. A narrow band system, typical of radars or wide microphone arrays, is one where the bandwidth is only a small fraction of the center frequency. With wide band systems this approximation no longer holds, which is typical in sonars.
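- The equivalence between a time delay and a phase shift, and the way it degrades as bandwidth grows, can be sketched as follows. A plane wave, a speed of sound of 343 m/s, and the function names are assumptions made for the example.

```python
import math

def inter_sensor_delay_s(spacing_m, angle_deg, speed_m_s=343.0):
    """Extra travel time between adjacent sensors for a plane wave arriving
    angle_deg away from broadside."""
    return spacing_m * math.sin(math.radians(angle_deg)) / speed_m_s

def phase_shift_rad(freq_hz, delay_s):
    """Phase shift that is equivalent to delay_s at a single frequency."""
    return 2.0 * math.pi * freq_hz * delay_s

def phase_error_rad(freq_hz, center_freq_hz, delay_s):
    """Error made by applying the center-frequency phase shift at another frequency."""
    return phase_shift_rad(freq_hz, delay_s) - phase_shift_rad(center_freq_hz, delay_s)

delay = inter_sensor_delay_s(0.343, 90.0)            # about 1 ms
print(phase_error_rad(1010.0, 1000.0, delay))        # small for a 1% offset
print(phase_error_rad(2000.0, 1000.0, delay))        # about a full cycle one octave up
```

- The error grows in proportion to the distance from the center frequency, which is why a single phase shift per sensor is adequate only when the bandwidth is a small fraction of the center frequency.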
- In the receive beamformer the signal from each sensor may be amplified by a different “weight.” Different weighting patterns (e.g., Dolph-Chebyshev) can be used to achieve the desired sensitivity patterns. A main lobe is produced together with nulls and side lobes. As well as controlling the main lobe width (the beam) and the side lobe levels, the position of a null can be controlled. This is useful to ignore noise or jammers in one particular direction, while listening for events in other directions. A similar result can be obtained on transmission.
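- The effect of the weights on the main lobe, nulls, and side lobes can be sketched for a uniform linear array. The half-wavelength spacing, the simple tapered weights (which are not Dolph-Chebyshev weights), and the function name `array_response` are assumptions made for the illustration.

```python
import cmath
import math

def array_response(weights, angle_deg, spacing_wavelengths=0.5):
    """Magnitude response of a weighted linear array to a plane wave arriving
    angle_deg away from broadside."""
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return abs(sum(w * cmath.exp(1j * n * phase_step) for n, w in enumerate(weights)))

uniform = [1.0, 1.0, 1.0, 1.0]
tapered = [0.5, 1.0, 1.0, 0.5]      # illustrative taper only

for angle in (0.0, 30.0, 48.59):
    print(angle,
          array_response(uniform, angle) / array_response(uniform, 0.0),
          array_response(tapered, angle) / array_response(tapered, 0.0))
```

- With four uniformly weighted elements, the response has a null at 30° from broadside. The tapered weights give a lower relative response near 48.59°, where the uniform array has a side lobe, at the cost of a wider main lobe.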
- Beamforming techniques can be broadly divided into two categories:
- a. conventional (fixed or switched beam) beamformers
- b. adaptive beamformers or phased array
  - i. desired signal maximization mode
  - ii. interference signal minimization or cancellation mode
- Conventional beamformers use a fixed set of weightings and time-delays (or phasings) to combine the signals from the sensors in the array, primarily using only information about the location of the sensors in space and the wave directions of interest. In contrast, adaptive beamforming techniques generally combine this information with properties of the signals actually received by the array, typically to improve rejection of unwanted signals from other directions. This process may be carried out in either the time or the frequency domain.
- As the name indicates, an adaptive beamformer is able to automatically adapt its response to different situations. Some criterion has to be set up to allow the adaption to proceed such as minimizing the total noise output. Because of the variation of noise with frequency, in wide band systems it may be desirable to carry out the process in the frequency domain.
- Beamforming can be computationally intensive.
- Beamforming can be used to try to extract sound sources in a room, such as multiple speakers in the cocktail party problem. This requires the locations of the speakers to be known in advance, for example by using the time of arrival from the sources to mics in the array, and inferring the locations from the distances.
- A Primer on Digital Beamforming by Toby Haynes, Mar. 26, 1998 http://www.spectrumsignal.com/publications/beamform_primer.pdf describes beam forming technology.
- According to U.S. Pat. No. 5,581,620, the disclosure of which is incorporated by reference herein, many communication systems, such as radar systems, sonar systems and microphone arrays, use beamforming to enhance the reception of signals. In contrast to conventional communication systems that do not discriminate between signals based on the position of the signal source, beamforming systems are characterized by the capability of enhancing the reception of signals generated from sources at specific locations relative to the system.
- Generally, beamforming systems include an array of spatially distributed sensor elements, such as antennas, sonar phones or microphones, and a data processing system for combining signals detected by the array. The data processor combines the signals to enhance the reception of signals from sources located at select locations relative to the sensor elements. Essentially, the data processor “aims” the sensor array in the direction of the signal source. For example, a linear microphone array uses two or more microphones to pick up the voice of a talker. Because one microphone is closer to the talker than the other microphone, there is a slight time delay between the two microphones. The data processor adds a time delay to the nearest microphone to coordinate these two microphones. By compensating for this time delay, the beamforming system enhances the reception of signals from the direction of the talker, and essentially aims the microphones at the talker.
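- The two-microphone example above can be sketched as a delay-and-sum operation. Integer sample delays, equal-length channels, and the function name `delay_and_sum` are assumptions made to keep the example short.

```python
def delay_and_sum(channels, delays_samples):
    """Delay each channel by its integer sample delay, then average the channels."""
    length = len(channels[0])
    output = [0.0] * length
    for signal, delay in zip(channels, delays_samples):
        for i in range(delay, length):
            output[i] += signal[i - delay]
    return [value / len(channels) for value in output]

# The talker's pulse reaches the nearer microphone at sample 2 and the farther
# one at sample 5. An off-axis sound reaches both microphones at sample 8.
near_mic = [0.0] * 12
far_mic = [0.0] * 12
near_mic[2] = 1.0
far_mic[5] = 1.0
near_mic[8] = 1.0
far_mic[8] = 1.0

# Delaying the nearer microphone by 3 samples lines up the talker's pulses.
steered = delay_and_sum([near_mic, far_mic], [3, 0])
print(steered[5], steered[8], steered[11])  # 1.0 0.5 0.5
```

- The talker's pulses add coherently to full amplitude, while the off-axis sound is spread over two samples at half amplitude.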
- A beamforming apparatus may connect to an array of sensors, e.g., microphones, that can detect signals generated from a signal source, such as the voice of a talker. The sensors can be spatially distributed in a linear, a two-dimensional, or a three-dimensional array, with a uniform or non-uniform spacing between sensors. A linear array is useful for an application where the sensor array is mounted on a wall or a podium; the talker is then free to move about a half-plane with an edge defined by the location of the array. Each sensor detects the voice audio signals of the talker and generates electrical response signals that represent these audio signals. An adaptive beamforming apparatus provides a signal processor that can dynamically determine the relative time delay between each of the audio signals detected by the sensors. Further, a signal processor may include a phase alignment element that uses the time delays to align the frequency components of the audio signals. The signal processor has a summation element that adds together the aligned audio signals to increase the quality of the desired audio source while simultaneously attenuating sources having different delays relative to the sensor array. Because the relative time delays for a signal relate to the position of the signal source relative to the sensor array, the beamforming apparatus provides, in one aspect, a system that “aims” the sensor array at the talker to enhance the reception of signals generated at the location of the talker and to diminish the energy of signals generated at locations different from that of the desired talker's location. The practical application of a linear array is limited to situations which are either in a half plane or where knowledge of the direction to the source is not critical. The addition of a third sensor that is not co-linear with the first two sensors is sufficient to define a planar direction, also known as azimuth. Three sensors do not provide sufficient information to determine elevation of a signal source. At least a fourth sensor, not co-planar with the first three sensors, is required to obtain sufficient information to determine a location in a three dimensional space.
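- The role of the third, non-co-linear sensor can be sketched as follows. A far-field (plane wave) source lying in the plane of the sensors, a speed of sound of 343 m/s, and the function name `azimuth_from_delays` are assumptions. Sensors closer to the source are taken to receive the wavefront earlier.

```python
import math

def azimuth_from_delays(positions, arrival_times, speed_m_s=343.0):
    """Estimate the azimuth of a far-field source from three sensor arrival times.

    positions: three (x, y) sensor coordinates in meters.
    arrival_times: arrival time in seconds at each sensor.
    Returns the angle in degrees, counterclockwise from the +x axis, of the
    unit vector pointing from the array toward the source.
    """
    (x0, y0), (x1, y1), (x2, y2) = positions
    t0, t1, t2 = arrival_times
    # For a unit vector u toward the source: (p_i - p_0) . u = -c * (t_i - t_0).
    a11, a12, b1 = x1 - x0, y1 - y0, -speed_m_s * (t1 - t0)
    a21, a22, b2 = x2 - x0, y2 - y0, -speed_m_s * (t2 - t0)
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("sensors are co-linear; the direction is ambiguous")
    ux = (b1 * a22 - a12 * b2) / det      # Cramer's rule for the 2x2 system
    uy = (a11 * b2 - a21 * b1) / det
    return math.degrees(math.atan2(uy, ux)) % 360.0
```

- With co-linear sensors the determinant is zero and the two equations collapse into one, which is the ambiguity that restricts a linear array to a half plane.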
- Although these systems work well if the position of the signal source is precisely known, their effectiveness drops off dramatically, and the computational resources required increase dramatically, with slight errors in the estimated a priori information. For instance, in some systems with source-location schemes, it has been shown that the data processor must know the location of the source within a few centimeters to enhance the reception of signals. Therefore, these systems require precise knowledge of the position of the source, and precise knowledge of the position of the sensors. As a consequence, these systems require both that the sensor elements in the array have a known and static spatial distribution and that the signal source remains stationary relative to the sensor array. Furthermore, these beamforming systems require a first step for determining the talker position and a second step for aiming the sensor array based on the expected position of the talker.
- A change in the position and orientation of the sensor can result in the aforementioned dramatic effects even if the talker is not moving due to the change in relative position and orientation due to movement of the arrays. Knowledge of any change in the location and orientation of the array can compensate for the increase in computational resources and decrease in effectiveness of the location determination and sound isolation.
- U.S. Pat. No. 7,415,117 shows audio source location identification and isolation. Known systems rely on stationary microphone arrays.
- A position sensor is any device that permits position measurement. It can either be an absolute position sensor or a relative one. Position sensors can be linear, angular, or multi-axis. Examples of position sensors include: capacitive transducer, capacitive displacement sensor, eddy-current sensor, ultrasonic sensor, grating sensor, Hall effect sensor, inductive non-contact position sensors, laser Doppler vibrometer (optical), linear variable differential transformer (LVDT), multi-axis displacement transducer, photodiode array, piezo-electric transducer (piezo-electric), potentiometer, proximity sensor (optical), rotary encoder (angular), seismic displacement pick-up, and string potentiometer (also known as string encoder or cable position transducer). Inertial position sensors are common in modern electronic devices.
- A gyroscope is a device used for measurement of angular velocity. Gyroscopes are available that can measure rotational velocity in 1, 2, or 3 directions. 3-axis gyroscopes are often implemented with a 3-axis accelerometer to provide a full 6 degree-of-freedom (DoF) motion tracking system. A gyroscopic sensor is a type of inertial position sensor that senses rate of rotation and may indicate roll, pitch, and yaw.
- An accelerometer is another common inertial position sensor. An accelerometer may measure proper acceleration, which is the acceleration it experiences relative to freefall and is the acceleration felt by people and objects. Accelerometers are available that can measure acceleration in one, two, or three orthogonal axes. The acceleration measurement has a variety of uses. The sensor can be implemented in a system that detects velocity, position, shock, vibration, or the acceleration of gravity to determine orientation. An accelerometer having two orthogonal sensors is capable of sensing pitch and roll. This is useful in capturing head movements. A third orthogonal sensor may be added to obtain orientation in three dimensional space. This is appropriate for the detection of pen angles, etc. The sensing capabilities of an inertial position sensor can detect changes in six degrees of spatial measurement freedom by the addition of three orthogonal gyroscopes to a three axis accelerometer.
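- As a minimal sketch of using the acceleration of gravity to determine orientation, the following assumes a stationary sensor reading in units of g and one common axis convention (x forward, y left, z up). The convention and the function name `pitch_roll_from_gravity` are assumptions made for illustration, not part of the disclosure.

```python
import math


def pitch_roll_from_gravity(ax, ay, az):
    """Estimate pitch and roll in degrees from a static accelerometer reading.

    Assumes the only acceleration present is gravity and that the axes
    follow the convention stated in the lead-in (an assumption).
    """
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return math.degrees(pitch), math.degrees(roll)


# Synthetic reading for a sensor pitched by 30 degrees with no roll.
reading = (-math.sin(math.radians(30.0)), 0.0, math.cos(math.radians(30.0)))
pitch_deg, roll_deg = pitch_roll_from_gravity(*reading)
```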
- Magnetometers are devices that measure the strength and/or direction of a magnetic field. Because a magnetic field has both a strength and a direction (it is a vector field), magnetometers that measure just the strength or direction are called scalar magnetometers, while those that measure both are called vector magnetometers. Today, both scalar and vector magnetometers are commonly found in consumer electronics, such as tablets and cellular devices. In most cases, magnetometers are used to obtain directional information in three dimensions by being paired with accelerometers and gyroscopes. The combined device is called an inertial measurement unit (“IMU”) or a 9-axis position sensor.
- A head-related transfer function (HRTF) is a response that characterizes how an ear receives a sound from a point in space; a pair of HRTFs for two ears can be used to synthesize a binaural sound that seems to come from a particular point in space. It is a transfer function, describing how a sound from a specific point will arrive at the ear (generally at the outer end of the auditory canal). Some consumer home entertainment products designed to reproduce surround sound from stereo (two-speaker) headphones use HRTFs. Some forms of HRTF-processing have also been included in computer software to simulate surround sound playback from loudspeakers.
- Humans have just two ears, but can locate sounds in three dimensions—in range (distance), in direction above and below, in front and to the rear, as well as to either side. This is possible because the brain, inner ear and the external ears (pinna) work together to make inferences about location. This ability to localize sound sources may have developed in humans and ancestors as an evolutionary necessity, since the eyes can only see a fraction of the world around a viewer, and vision is hampered in darkness, while the ability to localize a sound source works in all directions, to varying accuracy, regardless of the surrounding light.
- Humans estimate the location of a source by taking cues derived from one ear (monaural cues), and by comparing cues received at both ears (difference cues or binaural cues). Among the difference cues are time differences of arrival and intensity differences. The monaural cues come from the interaction between the sound source and the human anatomy, in which the original source sound is modified before it enters the ear canal for processing by the auditory system. These modifications encode the source location, and may be captured via an impulse response which relates the source location and the ear location. This impulse response is termed the head-related impulse response (HRIR). Convolution of an arbitrary source sound with the HRIR converts the sound to that which would have been heard by the listener if it had been played at the source location, with the listener's ear at the receiver location. HRIRs have been used to produce virtual surround sound.
- The HRTF is the Fourier transform of HRIR. The HRTF is also sometimes known as the anatomical transfer function (ATF).
- HRTFs for left and right ear (expressed above as HRIRs) describe the filtering of a sound source (x(t)) before it is perceived at the left and right ears as xL(t) and xR(t), respectively.
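- The filtering of x(t) into xL(t) and xR(t) can be sketched as a direct convolution with one HRIR per ear. The HRIR values below are made up purely for illustration (a real HRIR has hundreds of taps), and the helper name `convolve` is hypothetical.

```python
def convolve(x, h):
    """Full discrete convolution of signal x with impulse response h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


# Toy HRIRs: the right ear receives the sound one sample later and quieter,
# as it would for a source on the listener's left (illustrative values only).
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.5, 0.0]

x = [1.0, 2.0, 3.0]              # source signal x(t)
x_left = convolve(x, hrir_left)  # xL(t)
x_right = convolve(x, hrir_right)  # xR(t)
```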
- The HRTF can also be described as the modifications to a sound from a direction in free air to the sound as it arrives at the eardrum. These modifications include the shape of the listener's outer ear, the shape of the listener's head and body, the acoustic characteristics of the space in which the sound is played, and so on. All these characteristics will influence how (or whether) a listener can accurately tell what direction a sound is coming from. The associated mechanism varies between individuals, as their head and ear shapes differ.
- HRTF describes how a given sound wave input (parameterized as frequency and source location) is filtered by the diffraction and reflection properties of the head, pinna, and torso, before the sound reaches the transduction machinery of the eardrum and inner ear (see auditory system). Biologically, the source-location-specific pre-filtering effects of these external structures aid in the neural determination of source location, particularly the determination of the source's elevation (see vertical sound localization).
- Linear systems analysis defines the transfer function as the complex ratio between the output signal spectrum and the input signal spectrum as a function of frequency. Blauert (1974; cited in Blauert, 1981) initially defined the transfer function as the free-field transfer function (FFTF). Other terms include free-field to eardrum transfer function and the pressure transformation from the free-field to the eardrum. Less specific descriptions include the pinna transfer function, the outer ear transfer function, the pinna response, or directional transfer function (DTF).
- The transfer function H(f) of any linear time-invariant system at frequency f is:
-
H(f)=Output(f)/Input(f)
- One method used to obtain the HRTF from a given source location is therefore to measure the head-related impulse response (HRIR), h(t), at the eardrum for the impulse δ(t) placed at the source. The HRTF H(f) is the Fourier transform of the HRIR h(t).
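- The relation between the HRIR and the HRTF can be checked numerically with a naive discrete Fourier transform. This is a sketch only: the four-sample HRIR is invented for illustration, and a practical implementation would use an FFT library rather than the hypothetical `dft` helper below.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(n^2)), adequate for a short example."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


hrir = [0.0, 1.0, 0.5, 0.0]     # made-up impulse response h(t)
impulse = [1.0, 0.0, 0.0, 0.0]  # discrete stand-in for the impulse at the source

hrtf = dft(hrir)                # H(f) as the transform of h(t)
# The same H(f) obtained as Output(f) / Input(f) for the impulse input.
ratio = [out / inp for out, inp in zip(dft(hrir), dft(impulse))]
magnitudes = [abs(v) for v in hrtf]
```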
- Even when measured for a “dummy head” of idealized geometry, HRTFs are complicated functions of frequency and the three spatial variables. For distances greater than 1 m from the head, however, the HRTF can be said to attenuate inversely with range. It is this far field HRTF, H(f, θ, φ), that has most often been measured. At closer range, the difference in level observed between the ears can grow quite large, even in the low-frequency region within which negligible level differences are observed in the far field.
- HRTFs are typically measured in an anechoic chamber to minimize the influence of early reflections and reverberation on the measured response. HRTFs are measured at small increments of θ such as 15° or 30° in the horizontal plane, with interpolation used to synthesize HRTFs for arbitrary positions of θ. Even with small increments, however, interpolation can lead to front-back confusion, and optimizing the interpolation procedure is an active area of research.
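- One simple way to synthesize a response between measured azimuths is to blend linearly the two neighbouring measurements. The sketch below assumes a uniform measurement grid covering the full circle and uses invented two-tap responses; plain linear blending is only the most basic choice and is not presented here as the interpolation procedure of any particular system. The name `interpolate_hrir` is hypothetical.

```python
import math


def interpolate_hrir(measured, azimuth_deg):
    """Linearly blend the two measured HRIRs that bracket azimuth_deg.

    measured: dict mapping azimuth in whole degrees to an HRIR sample list,
              assumed to lie on a uniform grid covering the full circle.
    """
    angles = sorted(measured)
    step = angles[1] - angles[0]
    lower = int(azimuth_deg // step) * step % 360
    upper = (lower + step) % 360
    weight = (azimuth_deg % step) / step
    return [(1.0 - weight) * a + weight * b
            for a, b in zip(measured[lower], measured[upper])]


# Made-up two-tap "HRIRs" on a 30 degree grid, purely for illustration.
grid = {a: [math.cos(math.radians(a)), math.sin(math.radians(a))]
        for a in range(0, 360, 30)}
halfway = interpolate_hrir(grid, 45.0)      # between the 30 and 60 degree data
wrapped = interpolate_hrir(grid, 345.0)     # between the 330 and 0 degree data
```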
- In order to maximize the signal-to-noise ratio (SNR) in a measured HRTF, it is important that the impulse being generated be of high volume. In practice, however, it can be difficult to generate impulses at high volumes and, if generated, they can be damaging to human ears, so it is more common for HRTFs to be directly calculated in the frequency domain using a frequency-swept sine wave or by using maximum length sequences. User fatigue is still a problem, however, highlighting the need for the ability to interpolate based on fewer measurements.
- The head-related transfer function is involved in resolving the Cone of Confusion, a series of points where ITD and ILD are identical for sound sources from many locations around the “0” part of the cone. When a sound is received by the ear it can either go straight down the ear into the ear canal or it can be reflected off the pinnae of the ear, into the ear canal a fraction of a second later. The sound will contain many frequencies, so therefore many copies of this signal will go down the ear all at different times depending on their frequency (according to reflection, diffraction, and their interaction with high and low frequencies and the size of the structures of the ear.) These copies overlap each other, and during this, certain signals are enhanced (where the phases of the signals match) while other copies are canceled out (where the phases of the signal do not match). Essentially, the brain is looking for frequency notches in the signal that correspond to particular known directions of sound.
- If another person's ears were substituted, the individual would not immediately be able to localize sound, as the patterns of enhancement and cancellation would be different from those patterns the person's auditory system is used to. However, after some weeks, the auditory system would adapt to the new head-related transfer function. The inter-subject variability in the spectra of HRTFs has been studied through cluster analyses.
- Assessing the variation through changes between the person's ears, we can limit our perspective with the degrees of freedom of the head and its relation with the spatial domain. Through this, we eliminate the tilt and other co-ordinate parameters that add complexity. For the purpose of calibration we are only concerned with the direction level to our ears, ergo a specific degree of freedom. Some of the ways in which we can deduce an expression to calibrate the HRTF are:
- 1. Localization of sound in Virtual Auditory space
- 2. HRTF Phase synthesis
- 3. HRTF Magnitude synthesis
- A basic assumption in the creation of a virtual auditory space is that if the acoustical waveforms present at a listener's eardrums are the same under headphones as in free field, then the listener's experience should also be the same.
- Typically, sounds generated from headphones appear to originate from within the head. In the virtual auditory space, the headphones should be able to “externalize” the sound. Using the HRTF, sounds can be spatially positioned using the technique described below.
- Let x1(t) represent an electrical signal driving a loudspeaker and y1(t) represent the signal received by a microphone inside the listener's eardrum. Similarly, let x2(t) represent the electrical signal driving a headphone and y2(t) represent the microphone response to the signal. The goal of the virtual auditory space is to choose x2(t) such that y2(t)=y1(t). Applying the Fourier transform to these signals, we come up with the following two equations:
-
Y1=X1LFM, and
-
Y2=X2HM,
- where L is the transfer function of the loudspeaker in the free field, F is the HRTF, M is the microphone transfer function, and H is the headphone-to-eardrum transfer function.
- Setting Y1=Y2, and solving for X2 yields: X2=X1LF/H.
- By observation, the desired transfer function is: T=LF/H.
- Therefore, theoretically, if x1(t) is passed through this filter and the resulting x2(t) is played on the headphones, it should produce the same signal at the eardrum. Since the filter applies only to a single ear, another one must be derived for the other ear. This process is repeated for many places in the virtual environment to create an array of head-related transfer functions for each position to be recreated while ensuring that the sampling conditions are set by the Nyquist criteria.
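- The derivation of T=LF/H can be checked numerically for one ear. The sketch below works per frequency bin on complex spectra; all response values are invented for illustration and the function name `headphone_filter` is hypothetical.

```python
def headphone_filter(L, F, H):
    """Per-bin T = L*F/H for one ear; inputs are complex spectra of equal length."""
    return [l * f / h for l, f, h in zip(L, F, H)]


# Made-up complex responses at three frequency bins (illustrative only).
L = [1.0 + 0.0j, 0.9 - 0.1j, 0.7 - 0.3j]    # loudspeaker in the free field
F = [1.0 + 0.0j, 0.5 + 0.5j, 0.2 + 0.6j]    # HRTF
H = [0.8 + 0.0j, 0.7 + 0.2j, 0.6 + 0.1j]    # headphone-to-eardrum
M = [1.0 + 0.0j, 1.0 + 0.1j, 0.9 + 0.0j]    # microphone
X1 = [1.0 + 0.0j, 0.0 + 1.0j, -1.0 + 0.5j]  # loudspeaker drive spectrum

T = headphone_filter(L, F, H)
X2 = [x * t for x, t in zip(X1, T)]                      # headphone drive
Y1 = [x * l * f * m for x, l, f, m in zip(X1, L, F, M)]  # free-field eardrum signal
Y2 = [x * h * m for x, h, m in zip(X2, H, M)]            # headphone eardrum signal
```

In this toy case Y2 matches Y1 in every bin, which is the condition the filter is designed to meet. A second filter would be derived the same way for the other ear.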
- There is less reliable phase estimation in the very low part of the frequency band, and in the upper frequencies the phase response is affected by the features of the pinna. Earlier studies also show that the HRTF phase response is mostly linear and that listeners are insensitive to the details of the interaural phase spectrum as long as the interaural time delay (ITD) of the combined low-frequency part of the waveform is maintained. This is the modeled phase response of the subject HRTF as a time delay, dependent on the direction and elevation.
- A scaling factor is a function of the anthropometric features. For example, a training set of N subjects would consider each HRTF phase and describe a single ITD scaling factor as the average delay of the group. This computed scaling factor can estimate the time delay as function of the direction and elevation for any given individual. Converting the time delay to phase response for the left and the right ears is trivial.
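- Converting a time delay to a phase response amounts to applying a linear phase of −2πfτ at each frequency f. The following is a minimal sketch under that assumption; the function name `linear_phase` and the example delay are hypothetical.

```python
import math


def linear_phase(delay_s, freqs_hz):
    """Phase in radians, per frequency, of a pure delay of delay_s seconds."""
    return [-2.0 * math.pi * f * delay_s for f in freqs_hz]


# A 0.5 ms delay evaluated at DC, 500 Hz and 1 kHz.
phases = linear_phase(0.0005, [0.0, 500.0, 1000.0])
```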
- The HRTF phase can be described by the ITD scaling factor. This in turn is quantified by the anthropometric data of a given individual taken as the source of reference. For a generic case we consider β as a sparse vector
-
β=[β1, β2, . . . , βN]^T
- that represents the subject's anthropometric features as a linear superposition of the anthropometric features from the training data (y′=β^T X), and then apply the same sparse vector directly on the scaling vector H. We can write this task as a minimization problem, for a non-negative shrinking parameter λ:
-
- From this, ITD scaling factor value H′ is estimated as:
-
- where the ITD scaling factors for all persons in the dataset are stacked in a vector H ∈ ℝ^N, so the value Hn corresponds to the scaling factor of the n-th person.
- We solve the above minimization problem using the Least Absolute Shrinkage and Selection Operator (LASSO). We assume that the HRTFs are represented by the same relation as the anthropometric features. Therefore, once we learn the sparse vector β from the anthropometric features, we directly apply it to the HRTF tensor data, and the subject's HRTF values H′ are given by:
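- A plausible written-out form of the minimization and of the estimate H′ described above, assuming the standard LASSO objective and the relation y′=β^T X, is sketched below. The feature index a, the feature count A, and the hat on β are notation introduced here for illustration.

```latex
\hat{\beta} = \arg\min_{\beta}\left(
    \sum_{a=1}^{A}\Bigl( y_a - \sum_{n=1}^{N}\beta_n X_{n,a} \Bigr)^{2}
    + \lambda \sum_{n=1}^{N} \lvert \beta_n \rvert
\right),
\qquad
H' = \sum_{n=1}^{N} \hat{\beta}_n H_n .
```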
-
- where the HRTFs for each subject are described by a tensor of size D×K, where D is the number of HRTF directions and K is the number of frequency bins. All the HRTFs of the training set are stacked in a new tensor H ∈ ℝ^(N×D×K), so the value Hn,d,k corresponds to the k-th frequency bin for the d-th HRTF direction of the n-th person. Likewise, H′d,k corresponds to the k-th frequency bin for the d-th HRTF direction of the synthesized HRTF.
- Recordings processed via an HRTF that approximates the HRTF of the listener, such as in a computer gaming environment (for example with A3D, EAX, or OpenAL), can be heard through stereo headphones or speakers and interpreted as if they comprise sounds coming from all directions, rather than just two points on either side of the head. The perceived accuracy of the result depends on how closely the HRTF data set matches the physiological structure of the listener's head/ears.
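- The two-stage procedure (learn a sparse β from anthropometric features, then apply it to the stacked HRTFs) can be sketched with a small coordinate-descent LASSO solver. This is an illustration under stated assumptions: the objective is taken to be the squared error plus λ times the sum of |βn|, the data are invented, every training subject is assumed to have a non-zero feature vector, and the names `lasso_weights` and `synthesize_hrtf` are hypothetical.

```python
def soft_threshold(rho, t):
    """Shrink rho toward zero by t (the LASSO proximal step)."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def lasso_weights(X, y, lam, sweeps=100):
    """Coordinate descent for sum_a (y_a - sum_n b_n X[n][a])^2 + lam * sum_n |b_n|.

    X: N training subjects by A anthropometric features; y: the new subject's features.
    """
    n_subjects, n_features = len(X), len(y)
    beta = [0.0] * n_subjects
    for _ in range(sweeps):
        for n in range(n_subjects):
            # Residual with subject n's own contribution left out.
            partial = [
                y[a] - sum(beta[m] * X[m][a] for m in range(n_subjects) if m != n)
                for a in range(n_features)
            ]
            rho = sum(X[n][a] * partial[a] for a in range(n_features))
            z = sum(v * v for v in X[n])
            beta[n] = soft_threshold(rho, lam / 2.0) / z
    return beta


def synthesize_hrtf(beta, H):
    """Apply beta to an N x D x K stack of HRTFs, giving a D x K result."""
    n_dirs, n_bins = len(H[0]), len(H[0][0])
    return [[sum(b * H[n][d][k] for n, b in enumerate(beta)) for k in range(n_bins)]
            for d in range(n_dirs)]


# Toy data: two training subjects, two features, one direction, two bins.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 0.0]
H = [[[1.0, 2.0]], [[10.0, 20.0]]]

beta = lasso_weights(X, y, lam=0.2)
synthesized = synthesize_hrtf(beta, H)
```

In this toy case the new subject resembles only the first training subject, so the second weight is driven exactly to zero, which is the sparsity the shrinking parameter λ is meant to encourage.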
- An audio spatialization system is desirable for use in connection with a personal audio playback system such as headphones, earphones, and/or earbuds. The system is intended to operate so that a user can customize the audio information received through personal speakers. The system is capable of customizing the listening experience of a user and may include at least some portion of the ambient audio or artificially-generated position specific audio. The system may be provided so that the audio spatialization applied may maintain orientation with respect to a fixed frame of reference as the listener moves and tracks movement of an actual or apparent audio source even when the speakers and sensor are not maintained in the same relative position and orientation to the listener. For example, the system may operate to identify and isolate audio emanating from a source located in a particular position. The isolated audio may be provided through an audio spatialization engine to a user's personal speakers maintaining the same orientation. The system is designed so that the apparent location of audio from a set of personal speakers can be configured to remain constant when a user and/or the sensors turn or move. For example, if the user turns to the right, the personal speakers will turn with the user. The system may apply a modification to the spatialization so that the apparent location of the audio source will be moved relative to the user, i.e., to the user's left and the user will perceive the audio source remaining stationary even while the user is moving relative to the source. This may be accomplished by motion sensors detecting changes in position or orientation of the user and modifying the audio spatialization in order to compensate for the change in location or orientation of the user, and in particular the ear speakers being used. 
The system may also use audio source tracking to detect movement of the audio source and to compensate so that the user will perceive the audio source motion.
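- The compensation for a turning listener can be sketched as subtracting the head yaw reported by the motion sensor from the bearing of the source. The sign convention (yaw and bearing positive clockwise, so a negative result is to the user's left) and the function name `compensated_azimuth` are assumptions made for illustration.

```python
def compensated_azimuth(source_bearing_deg, head_yaw_deg):
    """Direction of the source relative to the head, wrapped to [-180, 180)."""
    return (source_bearing_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


# A source fixed straight ahead (bearing 0) while the user turns to the right.
yaw_readings = [0.0, 15.0, 30.0]
apparent = [compensated_azimuth(0.0, yaw) for yaw in yaw_readings]
```

As the yaw grows, the rendered direction moves to the user's left by the same amount, so the source is perceived as stationary.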
- In one use case, an augmented reality video game may be greatly enhanced by addition of directional audio. For example, in an augmented reality game, a game element may be assigned to a real world location. A player carrying a smart phone or personal communication device with a GPS or other position sensor may interact with game elements using application software on the personal communication device when in proximity to the game element. According to an embodiment of the disclosed system, a position sensor in fixed orientation with the user's head may be used to control spatialization of audio coordinated with the location assigned to the game element.
- In one use case, a user may be listening to music in an office, in a restaurant, at a sporting event or in any other environment in which there are multiple people speaking in various directions relative to the user. The user may be utilizing one or more detached microphone arrays or other sensors in order to identify and, when desired, stream certain sounds or voices to the user. The user may wish to quickly turn in the direction relative to the user from where the desired sound is emanating or from where the speaker is standing in order to show recognition to the speaker that he/she is heard and to focus visually in the direction of such sound source. The user may be wearing headphones, earphones, a hearable or assisted listening device incorporating or connected to a directional sensor, along with an ability to accurately reproduce sounds with a directional element (a straightforward function if such direction is to the left or right of the user, or a more complex function utilizing a 3D technology or spatial engine such as Realsound3D from Visisonics if the sound is from the front, back, or a different elevation relative to the user). According to an embodiment of the disclosed system, a position sensor in the external microphone array or sensor will synchronize with the position sensor of the user, thus enabling the user to hear the sounds in the user's ears as though the external sensor was being worn, even as it is detached from the user.
- An audio source signal may be connected to the audio spatialization system. The motion sensor associated with the personal speaker system may be connected to a listener position/orientation unit having an output connected to the audio spatialization engine representing position and orientation of the personal speaker system. The audio spatialization engine may add spatial characteristics to the output of the audio source on the basis of the output of the listener position/orientation unit and/or directional cues obtained from a directional cue reporting unit.
- An audio customization system may be provided to enhance a user's audio environment. An embodiment of the system may be implemented with a sensor (microphone) array that is not in a fixed location/direction relative to personal speakers.
- It is an object to apply directional information to audio presented to a personal speaker such as headphones or earbuds and to modify the spatial characteristics of the audio in response to changes in position or orientation of the personal speaker system and/or audio sensors. The audio spatialization system may include a personal speaker system with an input of an electrical signal which is converted to audio. An audio spatialization engine output is connected to the personal speaker system to apply a spatial or directional component to the audio being output by the personal speaker system. The directional cue reporting unit may include a location processor in turn connected to a beamforming unit, a beam steering unit and directionally discriminating acoustic sensor associated with the personal speaker system. The directionally discriminating acoustic sensor may be a microphone array. The association between the directionally discriminating acoustic sensor and the personal speaker system is such that there is a fixed or a known relationship between the position or orientation of the personal speaker system and the directionally discriminating acoustic sensor. A motion sensor also is arranged in a fixed or known position and orientation with respect to the personal speaker system. The audio spatialization engine may apply head related transfer functions to the audio source.
- An audio spatialization system may include a personal speaker system with an input representative of an audio input and an audio spatialization engine having an output representative of the audio output of the personal speaker system. An audio source having an output may be connected to the audio spatialization engine. A motion sensor may be associated with the personal speaker system. A listener position/orientation unit may have an input connected to the motion sensor and an output connected to the audio spatialization engine representing the position and orientation of the personal speaker system. The audio spatialization engine may add spatial characteristics to the output of the audio source on the basis of the output of the listener position/orientation unit. The audio spatialization system may include a directional cue reporting unit having an output representative of a direction connected to the audio spatialization engine. The audio spatialization engine may add spatial characteristics to the output of the audio source on the added basis of the output representative of a direction of the directional cue reporting unit. The directional cue reporting unit may include a location processor connected to a beamforming unit, a beam steering unit and a directionally discriminating acoustic sensor associated with the personal speaker system. The directionally discriminating acoustic sensor may be a microphone array. The motion sensor may be an accelerometer, a gyroscope, and/or a magnetometer. The audio spatialization engine may apply head related transfer functions to the output of the audio source.
- Various objects, features, aspects, and advantages of the present invention will become more apparent from the following detailed description of preferred embodiments of the invention, along with the accompanying drawings in which like numerals represent like components.
- Moreover, the above objects and advantages of the invention are illustrative, and not exhaustive, of those that can be achieved by the invention. Thus, these and other objects and advantages of the invention will be apparent from the description herein, both as embodied herein and as modified in view of any variations which will be apparent to those skilled in the art.
- FIG. 1 shows a pair of headphones with an embodiment of a microphone array.
- FIG. 2 shows a portable microphone array.
- FIG. 3 shows a spatial audio processing system.
- FIG. 4 shows a spatial audio processing system which may be used with non-ambient source information.
- Before the present invention is described in further detail, it is to be understood that the invention is not limited to the particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.
- Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
- Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, a limited number of the exemplary methods and materials are described herein.
- It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. For the sake of clarity, D/A and A/D conversions and specification of hardware or software driven processing may not be specified if they are well understood by those of ordinary skill in the art. The scope of the disclosures should be understood to include analog processing and/or digital processing and hardware and/or software driven components.
- All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates, which may need to be independently confirmed.
- FIG. 1 shows a pair of headphones which may be used in the system.
- The headphones 101 may include a headband 102. The headband 102 may form an arc which, when in use, sits over the user's head. The headphones 101 may also include ear speakers headband 102. The ear speakers
- A position sensor 106 may be mounted in the headphones, for example, in an ear speaker housing 103 or in a headband 102 (not shown). The position sensor 106 may be a 9-axis position sensor. The position sensor 106 may include a magnetometer and/or an accelerometer.
- FIG. 2 shows a portable microphone array. The portable microphone array may be contained in a housing 200. The configuration of the housing is not important to the operation. The housing may be a freestanding device. Alternatively, the housing 200 may be part of a personal communications device such as a cell phone or smart phone. The housing may be portable. The housing 200 may include a cover 201. A plurality of microphones 202 may be arranged on the cover 201. The plurality of microphones 202 may be positioned with any suitable geometric configuration. A linear arrangement is one possible geometric configuration. Advantageously, the plurality of microphones 202 may include three (3) or more non-co-linear microphones. Non-co-linear arrangement of three or more microphones is advantageous in that the microphone signals may be used by a beamformer for unambiguous determination of direction of arrival of point-generated audio.
- According to an embodiment, eight (8) microphones 202 may be provided which are equally spaced and define a circle. A central microphone 203 may also be provided to facilitate accurate source direction of arrival. The portable microphone array may also include a position sensor 204. The position sensor may be a 9-axis position sensor. The position sensor 204 may include an absolute orientation sensor such as a magnetometer.
- FIG. 3 shows a spatial audio processing system. The spatial audio processing system of FIG. 3 may operate on the assumption that the microphone array 301 is located in close proximity to the speakers 307 and the point audio source is located in a position that is not between the microphone array 301 and speakers 307. A microphone array 301 may provide a multi-channel signal representative of the audio information sensed by multiple microphones to an audio analysis and processing unit 303. An array position sensor 302 is fixedly linked to the microphone array 301 and generates a signal indicative of the orientation of the microphone array 301. The audio analysis and processing unit 303 operates to generate one or more signals representative of one or more audio beams of interest. An example of an audio analysis and processing unit is described in co-pending U.S. patent application Ser. No. ______, Attorney Docket No. 111031 entitled, “Audio Analysis and Processing System”, filed on even date herewith and expressly incorporated by reference herein.
- The audio analysis and processing unit may generate a signal corresponding to the audio beam direction which is connected to the position accumulator 305. The audio analysis and processing unit may use a beamformer to select a beam which includes audio information of interest or may include beam-steering capabilities to refine the direction of arrival of audio from an audio source.
- The speaker position sensor 304 may be fixed to speakers 307 and may generate a signal indicative of the speaker position. The signal indicative of the speaker position may be an absolute orientation signal such as may be generated by a magnetometer. The speaker position sensor 304 may utilize gyroscopic and/or inertial sensors. The position accumulator 305 has inputs indicative of the microphone array orientation, the speaker orientation, and the beam direction. This information is combined in order to determine the proper apparent direction of arrival of the audio information relative to the speaker position. The speaker 307 may be a personal speaker in fixed orientation relative to the user, for example, headphones or earphones. A spatial processor 306 may be provided to impart spatialization to the signal representing the audio beam. The spatial processor 306 may have an output which is a binaural spatialized audio signal connected to the speaker 307 which may be binaural speakers. The spatial processor 306 may apply a head-related transfer function to the signal representing the audio beam and generate a binaural output according to the direction determined by the position accumulator 305.
FIG. 4 shows a spatial audio processing system which may be used with non-ambient source information. The non-ambient source information may, for example, be used in augmented reality or virtual reality systems which are arranged to provide personal speakers with spatialized audio information. Elements in FIG. 4 which correlate to elements in FIG. 3 have been given the same reference numbers. An audio source system 401 may be a video game or other system which generates audio having a positional or directional frame of reference not fixed to the orientation of a personal speaker system 307. The directional source information system includes a source position 402 output provided to a position accumulator 405. The unit 401 also provides an audio output 403 which is intended to have an apparent direction of arrival indicated by source position 402. The position accumulator 405 receives a signal indicative of the orientation of the speaker position sensor 304, and a signal indicative of the intended direction of arrival given by the source position 402. The position accumulator 405 generates a signal indicative of the direction of arrival referenced to the orientation of the speakers 307. The spatial processor 306 spatializes the directional source audio 403 in accordance with the output of the position accumulator 405 and has an output of a spatialized binaural signal having the proper orientation, connected to speakers 307.
- According to an example, a personal speaker system may be oriented in a north-facing direction. If a microphone array is oriented in an east-facing direction and the direction of arrival of an audio signal is 45° off of the facing direction of the microphone array, the position accumulator receives a signal representative of each orientation, namely 0° for north, 90° for east and 45° for the direction of arrival, for a total of 135° (90−0+45) for the orientation of the apparent audio source relative to the orientation of the speakers.
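The arithmetic of this example can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the convention that all angles are in degrees measured clockwise, with headings referenced to north and the direction of arrival referenced to the facing direction of the microphone array, is an assumption rather than something the disclosure states.

```python
def normalize_heading(angle_deg: float) -> float:
    """Wrap an angle into the range [0, 360)."""
    return angle_deg % 360.0


def apparent_direction(mic_heading_deg: float,
                       speaker_heading_deg: float,
                       doa_deg: float) -> float:
    """Direction of the source relative to the speakers' facing direction.

    mic_heading_deg:     heading of the microphone array (0 = north, 90 = east)
    speaker_heading_deg: heading of the personal speakers
    doa_deg:             direction of arrival relative to the array's facing direction
    """
    return normalize_heading(mic_heading_deg - speaker_heading_deg + doa_deg)


if __name__ == "__main__":
    # Speakers facing north (0), array facing east (90), arrival 45 off the array.
    print(apparent_direction(90.0, 0.0, 45.0))  # 135.0
```

Wrapping the result keeps it in a single turn when the inputs change, for instance when the listener turns past north.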
- In an example of an augmented reality system, if a game element is located northeast of a speaker position sensor and the speaker is facing southeast, the spatialization applied to an audio signal associated with the game element is 45° (NE) − 135° (SE) = −90°.
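One way such a signed offset might be obtained in an augmented reality setting is sketched below in Python. The sketch assumes that the game supplies the element's position as east and north offsets from the listener, that headings are in degrees clockwise from north (northeast = 45°, southeast = 135°), and that a negative result means the source lies counterclockwise (to the left) of the facing direction. The function names are hypothetical.

```python
import math


def compass_bearing(east_offset: float, north_offset: float) -> float:
    """Bearing from the listener to a point, in degrees clockwise from north."""
    return math.degrees(math.atan2(east_offset, north_offset)) % 360.0


def signed_relative_direction(source_bearing_deg: float,
                              speaker_heading_deg: float) -> float:
    """Source direction relative to the speaker heading, wrapped to [-180, 180)."""
    return (source_bearing_deg - speaker_heading_deg + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    bearing = compass_bearing(1.0, 1.0)   # element to the northeast
    heading = 135.0                       # speakers facing southeast
    print(signed_relative_direction(bearing, heading))
```

For an element northeast of a listener facing southeast, this yields approximately −90°, matching the figure given in the example.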
- According to an advantageous feature, a motion detector such as a gyroscope and/or a compass may be provided in connection with a microphone array. Because the microphone array is configured to be carried by a person, and because people move, a motion detector may be used to ascertain changes in position and/or orientation of the microphone array.
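A minimal Python sketch of how a gyroscope signal might be used to follow changes in array orientation is given below. It assumes yaw-rate samples in degrees per second taken at a fixed interval and a known starting heading; the function name and parameters are hypothetical. Plain integration of this kind drifts over time, which is one reason an absolute reference such as a compass may be used alongside it.

```python
from typing import Iterable


def integrate_yaw(initial_heading_deg: float,
                  yaw_rates_dps: Iterable[float],
                  dt_s: float) -> float:
    """Update a heading by accumulating gyroscope yaw-rate samples.

    initial_heading_deg: starting heading, degrees clockwise from north
    yaw_rates_dps:       yaw-rate samples in degrees per second
    dt_s:                time between samples in seconds
    """
    heading = initial_heading_deg
    for rate in yaw_rates_dps:
        # Rectangular integration of one sample, wrapped to [0, 360).
        heading = (heading + rate * dt_s) % 360.0
    return heading


if __name__ == "__main__":
    # Array starts facing east and turns at 10 degrees/second for half a second.
    print(integrate_yaw(90.0, [10.0] * 5, 0.1))
```

The updated heading could then serve as the microphone array orientation input to a position accumulator.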
- The techniques, processes and apparatus described may be utilized to control operation of any device and conserve use of resources based on conditions detected or applicable to the device.
- The invention is described in detail with respect to preferred embodiments, and it will now be apparent from the foregoing to those skilled in the art that changes and modifications may be made without departing from the invention in its broader aspects, and the invention, therefore, as defined in the claims, is intended to cover all such changes and modifications that fall within the true spirit of the invention.
- Thus, specific apparatus and methods have been disclosed. It should be apparent, however, to those skilled in the art that many more modifications besides those already described are possible without departing from the inventive concepts herein. The inventive subject matter, therefore, is not to be restricted except in the spirit of the disclosure. Moreover, in interpreting the disclosure, all terms should be interpreted in the broadest possible manner consistent with the context. In particular, the terms “comprises” and “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, indicating that the referenced elements, components, or steps may be present, or utilized, or combined with other elements, components, or steps that are not expressly referenced.
Claims (6)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/355,766 US9980075B1 (en) | 2016-11-18 | 2016-11-18 | Audio source spatialization relative to orientation sensor and output |
US15/946,647 US11330388B2 (en) | 2016-11-18 | 2018-04-05 | Audio source spatialization relative to orientation sensor and output |
US17/719,285 US12262193B2 (en) | 2016-11-18 | 2022-04-12 | Audio source spatialization relative to orientation sensor and output |
US19/056,238 US20250193627A1 (en) | 2016-11-18 | 2025-02-18 | Audio source spatialization relative to orientation sensor and output |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/355,766 US9980075B1 (en) | 2016-11-18 | 2016-11-18 | Audio source spatialization relative to orientation sensor and output |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/946,647 Continuation US11330388B2 (en) | 2016-11-18 | 2018-04-05 | Audio source spatialization relative to orientation sensor and output |
Publications (2)
Publication Number | Publication Date |
---|---|
US9980075B1 US9980075B1 (en) | 2018-05-22 |
US20180146319A1 US20180146319A1 (en) | 2018-05-24 |
Family ID=62125518
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/355,766 Active US9980075B1 (en) | 2016-11-18 | 2016-11-18 | Audio source spatialization relative to orientation sensor and output |
US15/946,647 Active 2037-03-07 US11330388B2 (en) | 2016-11-18 | 2018-04-05 | Audio source spatialization relative to orientation sensor and output |
US17/719,285 Active US12262193B2 (en) | 2016-11-18 | 2022-04-12 | Audio source spatialization relative to orientation sensor and output |
US19/056,238 Pending US20250193627A1 (en) | 2016-11-18 | 2025-02-18 | Audio source spatialization relative to orientation sensor and output |
Family Applications After (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/946,647 Active 2037-03-07 US11330388B2 (en) | 2016-11-18 | 2018-04-05 | Audio source spatialization relative to orientation sensor and output |
US17/719,285 Active US12262193B2 (en) | 2016-11-18 | 2022-04-12 | Audio source spatialization relative to orientation sensor and output |
US19/056,238 Pending US20250193627A1 (en) | 2016-11-18 | 2025-02-18 | Audio source spatialization relative to orientation sensor and output |
Country Status (1)
Country | Link |
---|---|
US (4) | US9980075B1 (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180206038A1 (en) * | 2017-01-13 | 2018-07-19 | Bose Corporation | Real-time processing of audio data captured using a microphone array |
US20180376239A1 (en) * | 2017-06-27 | 2018-12-27 | Panasonic Intellectual Property Corporation Of America | Sound collecting apparatus |
US10277973B2 (en) * | 2017-03-31 | 2019-04-30 | Apple Inc. | Wireless ear bud system with pose detection |
US10386203B1 (en) * | 2015-11-05 | 2019-08-20 | Invensense, Inc. | Systems and methods for gyroscope calibration |
CN110225432A (en) * | 2019-05-10 | 2019-09-10 | 中国船舶重工集团公司第七一五研究所 | A kind of sonar target solid listens to method |
US11276397B2 (en) * | 2019-03-01 | 2022-03-15 | DSP Concepts, Inc. | Narrowband direction of arrival for full band beamformer |
US20220086592A1 (en) * | 2019-09-19 | 2022-03-17 | Wave Sciences, LLC | Spatial audio array processing system and method |
US12143806B2 (en) | 2019-09-19 | 2024-11-12 | Wave Sciences, LLC | Spatial audio array processing system and method |
Families Citing this family (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107290711A (en) * | 2016-03-30 | 2017-10-24 | 芋头科技(杭州)有限公司 | A kind of voice is sought to system and method |
US10477304B2 (en) | 2016-06-15 | 2019-11-12 | Mh Acoustics, Llc | Spatial encoding directional microphone array |
US10356514B2 (en) | 2016-06-15 | 2019-07-16 | Mh Acoustics, Llc | Spatial encoding directional microphone array |
US10297267B2 (en) * | 2017-05-15 | 2019-05-21 | Cirrus Logic, Inc. | Dual microphone voice processing for headsets with variable microphone array orientation |
US10009690B1 (en) * | 2017-12-08 | 2018-06-26 | Glen A. Norris | Dummy head for electronic calls |
US10674244B2 (en) * | 2018-02-21 | 2020-06-02 | Bose Corporation | Audio device |
US20190324117A1 (en) * | 2018-04-24 | 2019-10-24 | Mediatek Inc. | Content aware audio source localization |
GB201814988D0 (en) * | 2018-09-14 | 2018-10-31 | Squarehead Tech As | Microphone Arrays |
EP3668110B1 (en) * | 2018-12-12 | 2023-10-11 | FalCom A/S | Communication device with position-dependent spatial source generation, communication system, and related method |
EP3668123B1 (en) | 2018-12-13 | 2024-07-17 | GN Audio A/S | Hearing device providing virtual sound |
US10575094B1 (en) * | 2018-12-13 | 2020-02-25 | Dts, Inc. | Combination of immersive and binaural sound |
WO2020231888A1 (en) | 2019-05-10 | 2020-11-19 | BadVR, Inc. | Novel systems and methods for collecting, locating and visualizing sensor signals in extended reality |
WO2021018378A1 (en) * | 2019-07-29 | 2021-02-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus, method or computer program for processing a sound field representation in a spatial transform domain |
JP7392969B2 (en) * | 2019-08-19 | 2023-12-06 | 株式会社オーディオテクニカ | Microphone position determination method |
DE102019005855B4 (en) * | 2019-08-20 | 2021-03-18 | Christoph Kemper | Method for adapting a sound converter to a reference sound converter |
US10924838B1 (en) * | 2019-09-11 | 2021-02-16 | Bose Corporation | Audio device |
US11638111B2 (en) * | 2019-11-01 | 2023-04-25 | Meta Platforms Technologies, Llc | Systems and methods for classifying beamformed signals for binaural audio playback |
WO2021193734A1 (en) * | 2020-03-26 | 2021-09-30 | 日本電気株式会社 | Sound processing device, sound processing system, sound processing method, and non-transitory computer-readable medium having program stored therein |
WO2021243634A1 (en) * | 2020-06-04 | 2021-12-09 | Northwestern Polytechnical University | Binaural beamforming microphone array |
US11363383B2 (en) * | 2020-09-01 | 2022-06-14 | Logitech Europe S.A. | Dynamic adjustment of earbud performance characteristics |
US11617044B2 (en) | 2021-03-04 | 2023-03-28 | Iyo Inc. | Ear-mount able listening device with voice direction discovery for rotational correction of microphone array outputs |
US11140469B1 (en) | 2021-05-03 | 2021-10-05 | Bose Corporation | Open-ear headphone |
US12010483B2 (en) * | 2021-08-06 | 2024-06-11 | Qsc, Llc | Acoustic microphone arrays |
US12323785B2 (en) | 2023-03-31 | 2025-06-03 | Iyo Inc. | Virtual auditory display filters and associated systems, methods, and non-transitory computer-readable media |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120288126A1 (en) * | 2009-11-30 | 2012-11-15 | Nokia Corporation | Apparatus |
US20140036127A1 (en) * | 2012-08-02 | 2014-02-06 | Ronald Pong | Headphones with interactive display |
US8718930B2 (en) * | 2012-08-24 | 2014-05-06 | Sony Corporation | Acoustic navigation method |
US8981994B2 (en) * | 2011-09-30 | 2015-03-17 | Skype | Processing signals |
US9826297B2 (en) * | 2014-10-29 | 2017-11-21 | At&T Intellectual Property I, L.P. | Accessory device that provides sensor input to a media device |
Family Cites Families (183)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3806919A (en) | 1971-03-15 | 1974-04-23 | Lumatron Corp | Light organ |
US4776044A (en) | 1987-07-30 | 1988-10-11 | Makins J Patrick | Hat with audio earphones |
USRE38405E1 (en) | 1992-07-30 | 2004-01-27 | Clair Bros. Audio Enterprises, Inc. | Enhanced concert audio system |
CA2139866A1 (en) | 1992-07-30 | 1994-02-17 | Roy B. Clair, Jr. | Concert audio system |
US5581620A (en) | 1994-04-21 | 1996-12-03 | Brown University Research Foundation | Methods and apparatus for adaptive beamforming |
US5737431A (en) | 1995-03-07 | 1998-04-07 | Brown University Research Foundation | Methods and apparatus for source location estimation from microphone-array time-delay estimates |
JPH08279004A (en) | 1995-04-04 | 1996-10-22 | Fujitsu Ltd | Facility guidance system control method and facility guidance system |
US5764778A (en) | 1995-06-07 | 1998-06-09 | Sensimetrics Corporation | Hearing aid headset having an array of microphones |
US5638343A (en) | 1995-07-13 | 1997-06-10 | Sony Corporation | Method and apparatus for re-recording multi-track sound recordings for dual-channel playbacK |
US5619582A (en) | 1996-01-16 | 1997-04-08 | Oltman; Randy | Enhanced concert audio process utilizing a synchronized headgear system |
US5793875A (en) | 1996-04-22 | 1998-08-11 | Cardinal Sound Labs, Inc. | Directional hearing system |
US5778082A (en) | 1996-06-14 | 1998-07-07 | Picturetel Corporation | Method and apparatus for localization of an acoustic source |
AUPO099696A0 (en) * | 1996-07-12 | 1996-08-08 | Lake Dsp Pty Limited | Methods and apparatus for processing spatialised audio |
US5912976A (en) | 1996-11-07 | 1999-06-15 | Srs Labs, Inc. | Multi-channel audio enhancement system for use in recording and playback and methods for providing same |
US6176837B1 (en) | 1998-04-17 | 2001-01-23 | Massachusetts Institute Of Technology | Motion tracking system |
IL127790A (en) | 1998-04-21 | 2003-02-12 | Ibm | System and method for selecting, accessing and viewing portions of an information stream(s) using a television companion device |
IL135281A (en) | 2000-03-27 | 2004-05-12 | Phone Or Ltd | Small optical microphone/sensor |
US7110552B1 (en) | 2000-11-20 | 2006-09-19 | Front Row Adv | Personal listening device for arena events |
JP2004518383A (en) | 2001-01-29 | 2004-06-17 | シーメンス アクチエンゲゼルシヤフト | Electroacoustic conversion of audio signals, especially audio signals |
JP3700931B2 (en) | 2001-06-11 | 2005-09-28 | ヤマハ株式会社 | Multitrack digital recording and playback device |
US7349547B1 (en) | 2001-11-20 | 2008-03-25 | Plantronics, Inc. | Noise masking communications apparatus |
US6674687B2 (en) | 2002-01-25 | 2004-01-06 | Navcom Technology, Inc. | System and method for navigation using two-way ultrasonic positioning |
US20030161485A1 (en) | 2002-02-27 | 2003-08-28 | Shure Incorporated | Multiple beam automatic mixing microphone array processing via speech detection |
US6816437B1 (en) | 2002-06-03 | 2004-11-09 | Massachusetts Institute Of Technology | Method and apparatus for determining orientation |
AU2002300314B2 (en) | 2002-07-29 | 2009-01-22 | Hearworks Pty. Ltd. | Apparatus And Method For Frequency Transposition In Hearing Aids |
AU2003260875A1 (en) | 2002-09-23 | 2004-04-08 | Koninklijke Philips Electronics N.V. | Sound reproduction system, program and data carrier |
US7430300B2 (en) | 2002-11-18 | 2008-09-30 | Digisenz Llc | Sound production systems and methods for providing sound inside a headgear unit |
FR2852779B1 (en) | 2003-03-20 | 2008-08-01 | | PROCESS FOR PROCESSING AN ELECTRICAL SIGNAL OF SOUND |
US6959075B2 (en) | 2003-03-24 | 2005-10-25 | Cisco Technology, Inc. | Replay of conference audio |
US8001187B2 (en) | 2003-07-01 | 2011-08-16 | Apple Inc. | Peer-to-peer active content sharing |
AU2003236382B2 (en) | 2003-08-20 | 2011-02-24 | Phonak Ag | Feedback suppression in sound signal processing using frequency transposition |
WO2005029318A2 (en) | 2003-09-19 | 2005-03-31 | University Of Delaware | Methods and products for processing loop nests |
DE10345190A1 (en) | 2003-09-29 | 2005-04-21 | Thomson Brandt Gmbh | Method and arrangement for spatially constant location of hearing events by means of headphones |
US7475434B2 (en) | 2003-12-05 | 2009-01-13 | K-2 Corporation | Helmet with in-mold and post-applied hard shell |
US7415117B2 (en) | 2004-03-02 | 2008-08-19 | Microsoft Corporation | System and method for beamforming using a microphone array |
DE102004025533A1 (en) | 2004-05-25 | 2005-12-29 | Sennheiser Electronic Gmbh & Co. Kg | System for rendering audio-surround signals has signal source for allocation of signals, signal processing device for processing and separation of signals in main audio channel and surround channel, head phone and speaker |
US7620409B2 (en) | 2004-06-17 | 2009-11-17 | Honeywell International Inc. | Wireless communication system with channel hopping and redundant connectivity |
US20060013409A1 (en) | 2004-07-16 | 2006-01-19 | Sensimetrics Corporation | Microphone-array processing to generate directional cues in an audio signal |
EP1795041A4 (en) | 2004-09-07 | 2009-08-12 | Sensear Pty Ltd | Apparatus and method for sound enhancement |
US8170879B2 (en) | 2004-10-26 | 2012-05-01 | Qnx Software Systems Limited | Periodic signal enhancement system |
US7302468B2 (en) | 2004-11-01 | 2007-11-27 | Motorola Inc. | Local area preference determination system and method |
US7817805B1 (en) | 2005-01-12 | 2010-10-19 | Motion Computing, Inc. | System and method for steering the directional response of a microphone to a moving acoustic source |
US7583808B2 (en) | 2005-03-28 | 2009-09-01 | Mitsubishi Electric Research Laboratories, Inc. | Locating and tracking acoustic sources with microphone arrays |
US7970150B2 (en) | 2005-04-29 | 2011-06-28 | Lifesize Communications, Inc. | Tracking talkers using virtual broadside scan and directed beams |
US20090316529A1 (en) | 2005-05-12 | 2009-12-24 | Nokia Corporation | Positioning of a Portable Electronic Device |
FR2886503B1 (en) | 2005-05-27 | 2007-08-24 | Arkamys Sa | METHOD FOR PRODUCING MORE THAN TWO SEPARATE TEMPORAL ELECTRIC SIGNALS FROM A FIRST AND A SECOND TIME ELECTRICAL SIGNAL |
US7720462B2 (en) | 2005-07-21 | 2010-05-18 | Cisco Technology, Inc. | Network communications security enhancing |
WO2007019530A2 (en) | 2005-08-04 | 2007-02-15 | R2Di, Llc | Multi-channel wireless digital audio distribution system and methods |
US8566887B2 (en) | 2005-12-09 | 2013-10-22 | Time Warner Cable Enterprises Llc | Caption data delivery apparatus and methods |
US7848512B2 (en) | 2006-03-27 | 2010-12-07 | Kurt Eldracher | Personal audio device accessory |
USD552077S1 (en) | 2006-06-13 | 2007-10-02 | Robert Brunner | Headphone |
US8194873B2 (en) | 2006-06-26 | 2012-06-05 | Davis Pan | Active noise reduction adaptive filter leakage adjusting |
NO328582B1 (en) | 2006-12-29 | 2010-03-22 | Tandberg Telecom As | Microphone for audio source tracking |
JP5065687B2 (en) | 2007-01-09 | 2012-11-07 | 株式会社東芝 | Audio data processing device and terminal device |
US20090017910A1 (en) | 2007-06-22 | 2009-01-15 | Broadcom Corporation | Position and motion tracking of an object |
US7995770B1 (en) | 2007-02-02 | 2011-08-09 | Jeffrey Franklin Simon | Apparatus and method for aligning and controlling reception of sound transmissions at locations distant from the sound source |
JP4799443B2 (en) | 2007-02-21 | 2011-10-26 | 株式会社東芝 | Sound receiving device and method |
FR2918532B1 (en) | 2007-07-05 | 2015-04-24 | Arkamys | METHOD FOR THE SOUND PROCESSING OF A STEREO PHONE SIGNAL INSIDE A MOTOR VEHICLE AND A MOTOR VEHICLE USING THE SAME |
DE102007031677B4 (en) | 2007-07-06 | 2010-05-20 | Sda Software Design Ahnert Gmbh | Method and apparatus for determining a room acoustic impulse response in the time domain |
EP2202531A4 (en) | 2007-10-01 | 2012-12-26 | Panasonic Corp | SOUND SOURCE DIRECTION DETECTOR |
DK2065726T3 (en) | 2007-11-13 | 2012-05-21 | Uni I Oslo | High capacity ultrasonic zone localization system |
US8150054B2 (en) | 2007-12-11 | 2012-04-03 | Andrea Electronics Corporation | Adaptive filter in a sensor array system |
US20090188641A1 (en) | 2008-01-30 | 2009-07-30 | Andritz Inc. | Method and system for measuring and controlling digester or impregnation vessel chip level by measuring chip pressure |
JP4983630B2 (en) | 2008-02-05 | 2012-07-25 | ヤマハ株式会社 | Sound emission and collection device |
EP2272258A1 (en) | 2008-04-02 | 2011-01-12 | Jason Regler | Audio or audio/visual interactive entertainment system and switching device therefor |
WO2009132270A1 (en) | 2008-04-25 | 2009-10-29 | Andrea Electronics Corporation | Headset with integrated stereo array microphone |
US8989882B2 (en) | 2008-08-06 | 2015-03-24 | At&T Intellectual Property I, L.P. | Method and apparatus for managing presentation of media content |
US20100048134A1 (en) | 2008-08-19 | 2010-02-25 | Mccarthy Randall T | Wireless communication system and communication method with wireless headset |
US20120177219A1 (en) | 2008-10-06 | 2012-07-12 | Bbn Technologies Corp. | Wearable shooter localization system |
US7782610B2 (en) | 2008-11-17 | 2010-08-24 | Incase Designs Corp. | Portable electronic device case with battery |
US8150063B2 (en) | 2008-11-25 | 2012-04-03 | Apple Inc. | Stabilizing directional audio input from a moving microphone array |
EP3089558A3 (en) | 2008-11-26 | 2017-01-18 | Wireless Environment, LLC | Wireless lighting devices and applications |
US20100205222A1 (en) | 2009-02-10 | 2010-08-12 | Tom Gajdos | Music profiling |
FR2942096B1 (en) | 2009-02-11 | 2016-09-02 | Arkamys | METHOD FOR POSITIONING A SOUND OBJECT IN A 3D SOUND ENVIRONMENT, AUDIO MEDIUM IMPLEMENTING THE METHOD, AND ASSOCIATED TEST PLATFORM |
US10616619B2 (en) | 2009-03-03 | 2020-04-07 | Mobilitie, Llc | System and method for multi-channel WiFi video streaming |
US9986268B2 (en) | 2009-03-03 | 2018-05-29 | Mobilitie, Llc | System and method for multi-channel WiFi video streaming |
US8335318B2 (en) | 2009-03-20 | 2012-12-18 | Bose Corporation | Active noise reduction adaptive filtering |
US8396196B2 (en) | 2009-05-08 | 2013-03-12 | Apple Inc. | Transfer of multiple microphone signals to an audio host device |
US8160265B2 (en) | 2009-05-18 | 2012-04-17 | Sony Computer Entertainment Inc. | Method and apparatus for enhancing the generation of three-dimensional sound in headphone devices |
US8314354B2 (en) | 2009-07-27 | 2012-11-20 | Apple Inc. | Accessory controller for electronic devices |
GB2473267A (en) | 2009-09-07 | 2011-03-09 | Nokia Corp | Processing audio signals to reduce noise |
KR101387195B1 (en) | 2009-10-05 | 2014-04-21 | 하만인터내셔날인더스트리스인코포레이티드 | System for spatial extraction of audio signals |
US8509453B2 (en) | 2009-10-29 | 2013-08-13 | Google Inc. | Luminescent headphones without battery packs |
CN102860039B (en) | 2009-11-12 | 2016-10-19 | 罗伯特·亨利·弗莱特 | Speakerphone and/or microphone arrays and methods and systems using the same |
AU2010319282A1 (en) | 2009-11-16 | 2012-05-31 | Emory University | Lattice-mismatched core-shell quantum dots |
US8428286B2 (en) | 2009-11-30 | 2013-04-23 | Infineon Technologies Ag | MEMS microphone packaging and MEMS microphone module |
CH702399B1 (en) | 2009-12-02 | 2018-05-15 | Veovox Sa | Apparatus and method for capturing and processing the voice |
FR2954570B1 (en) | 2009-12-23 | 2012-06-08 | Arkamys | METHOD FOR ENCODING / DECODING AN IMPROVED STEREO DIGITAL STREAM AND ASSOCIATED ENCODING / DECODING DEVICE |
FR2954640B1 (en) | 2009-12-23 | 2012-01-20 | Arkamys | METHOD FOR OPTIMIZING STEREO RECEPTION FOR ANALOG RADIO AND ANALOG RADIO RECEIVER |
US8521316B2 (en) | 2010-03-31 | 2013-08-27 | Apple Inc. | Coordinated group musical experience |
FR2958825B1 (en) | 2010-04-12 | 2016-04-01 | Arkamys | METHOD OF SELECTING PERFECTLY OPTIMUM HRTF FILTERS IN A DATABASE FROM MORPHOLOGICAL PARAMETERS |
US8761421B2 (en) | 2011-01-14 | 2014-06-24 | Audiotoniq, Inc. | Portable electronic device and computer-readable medium for remote hearing aid profile storage |
US8866495B2 (en) | 2010-06-30 | 2014-10-21 | Access Business Group International Llc | Spatial tracking system and method |
US9491560B2 (en) | 2010-07-20 | 2016-11-08 | Analog Devices, Inc. | System and method for improving headphone spatial impression |
US9025782B2 (en) | 2010-07-26 | 2015-05-05 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for multi-microphone location-selective processing |
USD641725S1 (en) | 2010-08-02 | 2011-07-19 | Creative Technology Ltd | Headphones |
KR101782050B1 (en) | 2010-09-17 | 2017-09-28 | 삼성전자주식회사 | Apparatus and method for enhancing audio quality using non-uniform configuration of microphones |
US8861756B2 (en) | 2010-09-24 | 2014-10-14 | LI Creative Technologies, Inc. | Microphone array system |
WO2012048299A1 (en) | 2010-10-07 | 2012-04-12 | Clair Brothers Audio Enterprises, Inc. | Method and system for enhancing sound |
DE112011103447T5 (en) | 2010-10-15 | 2013-08-22 | Intelligent Mechatronic Systems, Inc. | Man-machine interaction controlled by implicit assignment and polymorphism |
US9031256B2 (en) | 2010-10-25 | 2015-05-12 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for orientation-sensitive recording control |
US9552840B2 (en) | 2010-10-25 | 2017-01-24 | Qualcomm Incorporated | Three-dimensional sound capturing and reproducing with multi-microphones |
WO2012063103A1 (en) * | 2010-11-12 | 2012-05-18 | Nokia Corporation | An Audio Processing Apparatus |
US8525868B2 (en) | 2011-01-13 | 2013-09-03 | Qualcomm Incorporated | Variable beamforming with a mobile platform |
US9037458B2 (en) | 2011-02-23 | 2015-05-19 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for spatially selective audio augmentation |
US8929564B2 (en) | 2011-03-03 | 2015-01-06 | Microsoft Corporation | Noise adaptive beamforming for microphone arrays |
EP2509337B1 (en) | 2011-04-06 | 2014-09-24 | Sony Ericsson Mobile Communications AB | Accelerometer vector controlled noise cancelling method |
GB201105902D0 (en) | 2011-04-07 | 2011-05-18 | Sonitor Technologies As | Location system |
CA2833544A1 (en) | 2011-04-18 | 2012-10-26 | Eyesee360, Inc. | Apparatus and method for panoramic video imaging with mobile computing devices |
US9226088B2 (en) | 2011-06-11 | 2015-12-29 | Clearone Communications, Inc. | Methods and apparatuses for multiple configurations of beamforming microphone arrays |
FR2976748B1 (en) | 2011-06-17 | 2013-12-27 | Arkamys | METHOD FOR STANDARDIZING THE POWER OF A SOUND SIGNAL AND ASSOCIATED PROCESSING DEVICE |
US20130030789A1 (en) | 2011-07-29 | 2013-01-31 | Reginald Dalce | Universal Language Translator |
GB201113805D0 (en) | 2011-08-11 | 2011-09-21 | Rb Concepts Ltd | Interactive lighting effect and wristband |
US8949958B1 (en) | 2011-08-25 | 2015-02-03 | Amazon Technologies, Inc. | Authentication using media fingerprinting |
US8515751B2 (en) | 2011-09-28 | 2013-08-20 | Google Inc. | Selective feedback for text recognition systems |
US9326064B2 (en) | 2011-10-09 | 2016-04-26 | VisiSonics Corporation | Microphone array configuration and method for operating the same |
US9402117B2 (en) | 2011-10-19 | 2016-07-26 | Wave Sciences, LLC | Wearable directional microphone array apparatus and system |
KR20130048075A (en) | 2011-11-01 | 2013-05-09 | 삼성전자주식회사 | Apparatus for multiple sound source localization and method the same |
FR2982404B1 (en) | 2011-11-07 | 2014-01-03 | Arkamys | METHOD FOR REDUCING PARASITIC VIBRATIONS OF A SPEAKER ENVIRONMENT FOR PRESERVING PERCEPTION OF THE LOW FREQUENCIES OF THE SIGNAL TO BE DISTRIBUTED AND ASSOCIATED PROCESSING DEVICE |
US9143595B1 (en) | 2011-11-29 | 2015-09-22 | Ryan Michael Dowd | Multi-listener headphone system with luminescent light emissions dependent upon selected channels |
US20130148814A1 (en) | 2011-12-10 | 2013-06-13 | Stmicroelectronics Asia Pacific Pte Ltd | Audio acquisition systems and methods |
WO2013147791A1 (en) * | 2012-03-29 | 2013-10-03 | Intel Corporation | Audio control based on orientation |
US20130322214A1 (en) | 2012-05-29 | 2013-12-05 | Corning Cable Systems Llc | Ultrasound-based localization of client devices in distributed communication systems, and related devices, systems, and methods |
WO2013186593A1 (en) | 2012-06-14 | 2013-12-19 | Nokia Corporation | Audio capture apparatus |
US9137281B2 (en) | 2012-06-22 | 2015-09-15 | Guest Tek Interactive Entertainment Ltd. | Dynamically enabling guest device supporting network-based media sharing protocol to share media content over local area computer network of lodging establishment with subset of in-room media devices connected thereto |
WO2014022359A2 (en) | 2012-07-30 | 2014-02-06 | Personics Holdings, Inc. | Automatic sound pass-through method and system for earphones |
US9516407B2 (en) | 2012-08-13 | 2016-12-06 | Apple Inc. | Active noise control with compensation for error sensing at the eardrum |
US9313572B2 (en) | 2012-09-28 | 2016-04-12 | Apple Inc. | System and method of detecting a user's voice activity using an accelerometer |
EP2759147A1 (en) | 2012-10-02 | 2014-07-30 | MH Acoustics, LLC | Earphones having configurable microphone arrays |
US9132342B2 (en) | 2012-10-31 | 2015-09-15 | Sulon Technologies Inc. | Dynamic environment and location based augmented reality (AR) systems |
GB2509157A (en) | 2012-12-21 | 2014-06-25 | Crowd Connected Ltd | Forming an image using plural pixel devices and determining the position of a plurality of mobile devices |
JP6228539B2 (en) | 2012-12-28 | 2017-11-08 | 株式会社スポットライト | Ultrasonic communication system |
JP6089706B2 (en) | 2013-01-07 | 2017-03-08 | 富士通株式会社 | Transmission signal power control apparatus, communication apparatus, and predistortion coefficient update method |
US20140200054A1 (en) | 2013-01-14 | 2014-07-17 | Fraden Corp. | Sensing case for a mobile communication device |
US20140233181A1 (en) | 2013-02-21 | 2014-08-21 | Donn K. Harms | Protective Case Device with Interchangeable Faceplate System |
US9462379B2 (en) | 2013-03-12 | 2016-10-04 | Google Technology Holdings LLC | Method and apparatus for detecting and controlling the orientation of a virtual microphone |
US9351091B2 (en) | 2013-03-12 | 2016-05-24 | Google Technology Holdings LLC | Apparatus with adaptive microphone configuration based on surface proximity, surface type and motion |
US10229697B2 (en) | 2013-03-12 | 2019-03-12 | Google Technology Holdings LLC | Apparatus and method for beamforming to obtain voice and noise signals |
US8934654B2 (en) | 2013-03-13 | 2015-01-13 | Aliphcom | Non-occluded personal audio and communication system |
US9270244B2 (en) | 2013-03-13 | 2016-02-23 | Personics Holdings, Llc | System and method to detect close voice sources and automatically enhance situation awareness |
US9699553B2 (en) | 2013-03-15 | 2017-07-04 | Skullcandy, Inc. | Customizing audio reproduction devices |
US9363596B2 (en) | 2013-03-15 | 2016-06-07 | Apple Inc. | System and method of mixing accelerometer and microphone signals to improve voice quality in a mobile device |
JP6056625B2 (en) | 2013-04-12 | 2017-01-11 | 富士通株式会社 | Information processing apparatus, voice processing method, and voice processing program |
US9621974B2 (en) | 2013-05-20 | 2017-04-11 | Rajkumari Mohindra | Dual purpose pill reminder and tamper detector |
US9984675B2 (en) | 2013-05-24 | 2018-05-29 | Google Technology Holdings LLC | Voice controlled audio recording system with adjustable beamforming |
US20140359444A1 (en) | 2013-05-31 | 2014-12-04 | Escape Media Group, Inc. | Streaming live broadcast media |
US10262462B2 (en) * | 2014-04-18 | 2019-04-16 | Magic Leap, Inc. | Systems and methods for augmented and virtual reality |
US9065718B2 (en) | 2013-08-03 | 2015-06-23 | Gregory Hubert Piesinger | COFDM using pseudo orthogonal QPSK coding |
US9451162B2 (en) | 2013-08-21 | 2016-09-20 | Jaunt Inc. | Camera array including camera modules |
US9591404B1 (en) | 2013-09-27 | 2017-03-07 | Amazon Technologies, Inc. | Beamformer design using constrained convex optimization in three-dimensional space |
US9286897B2 (en) | 2013-09-27 | 2016-03-15 | Amazon Technologies, Inc. | Speech recognizer with multi-directional decoding |
US10382864B2 (en) | 2013-12-10 | 2019-08-13 | Cirrus Logic, Inc. | Systems and methods for providing adaptive playback equalization in an audio device |
US9467972B2 (en) | 2013-12-30 | 2016-10-11 | Motorola Solutions, Inc. | Multicast wireless communication system |
US8767996B1 (en) | 2014-01-06 | 2014-07-01 | Alpine Electronics of Silicon Valley, Inc. | Methods and devices for reproducing audio signals with a haptic apparatus on acoustic headphones |
US9087506B1 (en) | 2014-01-21 | 2015-07-21 | Doppler Labs, Inc. | Passive acoustical filters incorporating inserts that reduce the speed of sound |
US9552359B2 (en) | 2014-02-21 | 2017-01-24 | Apple Inc. | Revisiting content history |
US9557960B2 (en) | 2014-04-08 | 2017-01-31 | Doppler Labs, Inc. | Active acoustic filter with automatic selection of filter parameters based on ambient sound |
US9524731B2 (en) | 2014-04-08 | 2016-12-20 | Doppler Labs, Inc. | Active acoustic filter with location-based filter characteristics |
US9648436B2 (en) | 2014-04-08 | 2017-05-09 | Doppler Labs, Inc. | Augmented reality sound system |
US9560437B2 (en) | 2014-04-08 | 2017-01-31 | Doppler Labs, Inc. | Time heuristic audio control |
US9825598B2 (en) | 2014-04-08 | 2017-11-21 | Doppler Labs, Inc. | Real-time combination of ambient audio and a secondary audio source |
US9953492B2 (en) | 2014-04-18 | 2018-04-24 | Siemens Schweiz Ag | Configurable macro button for voice system activation by alarm system operator |
WO2015164287A1 (en) | 2014-04-21 | 2015-10-29 | Uqmartyne Management Llc | Wireless earphone |
US9911454B2 (en) | 2014-05-29 | 2018-03-06 | Jaunt Inc. | Camera array including camera modules |
US9992569B2 (en) | 2014-05-30 | 2018-06-05 | Paul D. Terpstra | Camera-mountable acoustic collection assembly |
US9904851B2 (en) | 2014-06-11 | 2018-02-27 | At&T Intellectual Property I, L.P. | Exploiting visual information for enhancing audio signals via source separation and beamforming |
US20150382096A1 (en) | 2014-06-25 | 2015-12-31 | Roam, Llc | Headphones with pendant audio processing |
US9432769B1 (en) | 2014-07-30 | 2016-08-30 | Amazon Technologies, Inc. | Method and system for beam selection in microphone array beamformers |
KR20160045353A (en) | 2014-10-17 | 2016-04-27 | 현대자동차주식회사 | Audio video navigation, vehicle and controlling method of the audio video navigation |
US9508335B2 (en) | 2014-12-05 | 2016-11-29 | Stages Pcs, Llc | Active noise control and customized audio system |
US20170374455A1 (en) * | 2015-01-20 | 2017-12-28 | 3M Innovative Properties Company | Mountable sound capture and reproduction device for determining acoustic signal origin |
WO2016118656A1 (en) * | 2015-01-21 | 2016-07-28 | Harman International Industries, Incorporated | Techniques for amplifying sound based on directions of interest |
KR101648840B1 (en) | 2015-02-16 | 2016-08-30 | 포항공과대학교 산학협력단 | Hearing-aids attached to mobile electronic device |
US9838782B2 (en) | 2015-03-30 | 2017-12-05 | Bose Corporation | Adaptive mixing of sub-band signals |
EP3304927A4 (en) | 2015-06-03 | 2018-07-18 | Razer (Asia-Pacific) Pte. Ltd. | Headset devices and methods for controlling a headset device |
JP6652978B2 (en) | 2015-06-26 | 2020-02-26 | ハーマン インターナショナル インダストリーズ インコーポレイテッド | Sports headphones with situational awareness |
US10257637B2 (en) * | 2015-06-30 | 2019-04-09 | Harman International Industries, Incorporated | Shoulder-mounted robotic speakers |
GB2543275A (en) * | 2015-10-12 | 2017-04-19 | Nokia Technologies Oy | Distributed audio capture and mixing |
GB2540199A (en) * | 2015-07-09 | 2017-01-11 | Nokia Technologies Oy | An apparatus, method and computer program for providing sound reproduction |
US10206035B2 (en) | 2015-08-31 | 2019-02-12 | University Of Maryland | Simultaneous solution for sparsity and filter responses for a microphone network |
WO2017174136A1 (en) | 2016-04-07 | 2017-10-12 | Sonova Ag | Hearing assistance system |
US10979843B2 (en) * | 2016-04-08 | 2021-04-13 | Qualcomm Incorporated | Spatialized audio output based on predicted position data |
US10089063B2 (en) * | 2016-08-10 | 2018-10-02 | Qualcomm Incorporated | Multimedia device for processing spatialized audio based on movement |
- 2016-11-18: US 15/355,766, published as US9980075B1 (en), status: Active
- 2018-04-05: US 15/946,647, published as US11330388B2 (en), status: Active
- 2022-04-12: US 17/719,285, published as US12262193B2 (en), status: Active
- 2025-02-18: US 19/056,238, published as US20250193627A1 (en), status: Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120288126A1 (en) * | 2009-11-30 | 2012-11-15 | Nokia Corporation | Apparatus |
US8981994B2 (en) * | 2011-09-30 | 2015-03-17 | Skype | Processing signals |
US20140036127A1 (en) * | 2012-08-02 | 2014-02-06 | Ronald Pong | Headphones with interactive display |
US8718930B2 (en) * | 2012-08-24 | 2014-05-06 | Sony Corporation | Acoustic navigation method |
US9826297B2 (en) * | 2014-10-29 | 2017-11-21 | At&T Intellectual Property I, L.P. | Accessory device that provides sensor input to a media device |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10386203B1 (en) * | 2015-11-05 | 2019-08-20 | Invensense, Inc. | Systems and methods for gyroscope calibration |
US20180206038A1 (en) * | 2017-01-13 | 2018-07-19 | Bose Corporation | Real-time processing of audio data captured using a microphone array |
US11601743B2 (en) | 2017-03-31 | 2023-03-07 | Apple Inc. | Wireless ear bud system with pose detection |
US10277973B2 (en) * | 2017-03-31 | 2019-04-30 | Apple Inc. | Wireless ear bud system with pose detection |
US10715902B2 (en) | 2017-03-31 | 2020-07-14 | Apple Inc. | Wireless ear bud system with pose detection |
US12294825B2 (en) | 2017-03-31 | 2025-05-06 | Apple Inc. | Wireless ear bud system with pose detection |
US10356508B2 (en) * | 2017-06-27 | 2019-07-16 | Panasonic Intellectual Property Corporation Of America | Sound collecting apparatus |
US20180376239A1 (en) * | 2017-06-27 | 2018-12-27 | Panasonic Intellectual Property Corporation Of America | Sound collecting apparatus |
US11276397B2 (en) * | 2019-03-01 | 2022-03-15 | DSP Concepts, Inc. | Narrowband direction of arrival for full band beamformer |
US11651772B2 (en) | 2019-03-01 | 2023-05-16 | DSP Concepts, Inc. | Narrowband direction of arrival for full band beamformer |
CN110225432A (en) * | 2019-05-10 | 2019-09-10 | 中国船舶重工集团公司第七一五研究所 | A kind of sonar target solid listens to method |
US20220086592A1 (en) * | 2019-09-19 | 2022-03-17 | Wave Sciences, LLC | Spatial audio array processing system and method |
US11997474B2 (en) * | 2019-09-19 | 2024-05-28 | Wave Sciences, LLC | Spatial audio array processing system and method |
US12143806B2 (en) | 2019-09-19 | 2024-11-12 | Wave Sciences, LLC | Spatial audio array processing system and method |
Also Published As
Publication number | Publication date |
---|---|
US20190116444A1 (en) | 2019-04-18 |
US20250193627A1 (en) | 2025-06-12 |
US20220240045A1 (en) | 2022-07-28 |
US9980075B1 (en) | 2018-05-22 |
US12262193B2 (en) | 2025-03-25 |
US11330388B2 (en) | 2022-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12262193B2 (en) | | Audio source spatialization relative to orientation sensor and output |
US11601764B2 (en) | | Audio analysis and processing system |
US9980042B1 (en) | | Beamformer direction of arrival and orientation analysis system |
US9774970B2 (en) | | Multi-channel multi-domain source identification and tracking |
US10397722B2 (en) | | Distributed audio capture and mixing |
US20160165350A1 (en) | | Audio source spatialization |
US9706292B2 (en) | | Audio camera using microphone arrays for real time capture of audio images and method for jointly processing the audio images with video images |
US20160165341A1 (en) | | Portable microphone array |
US8744101B1 (en) | | System for controlling the primary lobe of a hearing instrument's directional sensitivity pattern |
US20160165338A1 (en) | | Directional audio recording system |
US20160161595A1 (en) | | Narrowcast messaging system |
US20160161589A1 (en) | | Audio source imaging system |
US20160161594A1 (en) | | Swarm mapping system |
JP2008543143A (en) | | Acoustic transducer assembly, system and method |
WO2021064468A1 (en) | | Sound source localization with co-located sensor elements |
JP2017092732A (en) | | Auditory supporting system and auditory supporting device |
US20160161588A1 (en) | | Body-mounted multi-planar array |
US20160192066A1 (en) | | Outerwear-mounted multi-directional sensor |
JP2008543144A (en) | | Acoustic signal apparatus, system, and method |
US20180146285A1 (en) | | Audio Gateway System |
US20080267422A1 (en) | | Microphone Array and Digital Signal Processing System |
US20160165339A1 (en) | | Microphone array and audio source tracking system |
US20160165342A1 (en) | | Helmet-mounted multi-directional sensor |
CN115967883A (en) | | Headphone, user equipment and method for processing signals |
Hur et al. | | Techniques for synthetic reconfiguration of microphone arrays |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: STAGES PCS, LLC, NEW JERSEY; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: BENATTAR, BENJAMIN, MR.; REEL/FRAME: 040371/0477; Effective date: 20161118 |
 | AS | Assignment | Owner name: STAGES LLC, NEW JERSEY; Free format text: CHANGE OF NAME; ASSIGNOR: STAGES PCS, LLC; REEL/FRAME: 040773/0601; Effective date: 20160630 |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY; Year of fee payment: 4 |