EP1388145B1 - Device and method for analyzing an audio signal with regard to rhythm information - Google Patents
- Publication number
- EP1388145B1 (application EP02745267A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- sub
- information
- band
- rhythm
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
- G10H1/40—Rhythm
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/076—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of timing, tempo; Beat detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
- G10H2250/135—Autocorrelation
Definitions
- The present invention relates to signal processing concepts and, in particular, to the analysis of audio signals with regard to rhythm information.
- Of particular interest is the determination or extraction of characteristics that are not only signal-theoretical but have, as far as possible, immediate semantic meaning, i.e. that represent properties immediately perceived by the listener.
- The tempo is an important musical parameter that has semantic meaning.
- The tempo is usually measured in "beats per minute" (BPM).
- BPM beats per minute
- The automatic extraction of the tempo and of the beat emphases, or generally speaking the automatic extraction of rhythm information, is an example of the extraction of a semantically important feature of a piece of music.
- The term "beat tracking" has become established for the determination of the beat emphases and thus also of the tempo, i.e. for the determination of rhythm information. Beat tracking on the basis of a note-like or transcribed signal representation, e.g. in MIDI format, is already known from the prior art. The goal, however, is not to need such meta-representations but to perform the analysis directly on, for example, a PCM-encoded or, generally speaking, digitally available audio signal.
- The input signal is split by a filter bank into a number of subbands, for example into 6 subbands with crossover frequencies of 200 Hz, 400 Hz, 800 Hz, 1600 Hz and 3200 Hz.
- Low-pass filtering is performed for the first subband.
- High-pass filtering is performed for the last subband, and bandpass filtering is described for the remaining subbands in between.
- Each subband is processed as follows. The subband signal is first rectified,
- i.e. the absolute value of the samples is determined.
- The resulting n values are then smoothed, for example by averaging over a suitable window, to obtain an envelope signal.
- The envelope signals are differentiated, i.e. sudden changes in signal amplitude are preferentially passed by the differentiation filter.
- The result is then limited to non-negative values.
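The per-subband steps above (rectify, smooth, differentiate, limit to non-negative values) can be sketched as follows. This is a minimal illustration, not the implementation from the cited prior art: the function name `onset_envelope`, the trailing moving-average window and the first-order difference are assumptions chosen for brevity.

```python
def onset_envelope(samples, window=4):
    """Rectify, smooth, differentiate and limit one subband signal.

    `window` (hypothetical parameter) is the length of the trailing
    moving average used for smoothing.
    """
    if not samples:
        return []
    # Rectification: absolute value of every sample.
    rectified = [abs(x) for x in samples]
    # Smoothing: trailing moving average (assumed window shape).
    smoothed = []
    for i in range(len(rectified)):
        chunk = rectified[max(0, i - window + 1):i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    # Differentiation: first-order difference, so sudden rises stand out.
    diff = [0.0] + [smoothed[i] - smoothed[i - 1]
                    for i in range(1, len(smoothed))]
    # Limitation to non-negative values: keep rises, drop decays.
    return [max(0.0, d) for d in diff]


burst = [0, 0, 0, 0, 1, -1, 1, -1, 0, 0, 0, 0]
print(onset_envelope(burst, window=2))
```

In this toy input the only non-zero output values sit at the start of the burst, which is the behaviour the differentiation and limiting steps are meant to produce.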
- Each envelope signal is then fed into a bank of resonant filters, i.e. oscillators, which contains one filter for each tempo range, so that the filter matching the musical tempo is excited most strongly.
- For each filter, the energy of the output signal is calculated as a measure of the match between the tempo of the input signal and the tempo associated with the filter.
- The energies for each tempo are finally summed over all subbands, and the largest energy sum identifies the tempo delivered as the result, i.e. the rhythm information.
- A major disadvantage of this method is its great computing and storage complexity, in particular for implementing the large number of "oscillators" resonating in parallel, of which ultimately only one is selected. This makes an efficient implementation, for example for real-time applications, almost impossible.
- The known algorithm is shown in FIG. 3 as a block diagram.
- The audio signal is fed via an audio input 300 to an analysis filter bank 302.
- The analysis filter bank generates a number n of channels, i.e. individual subband signals, from the audio input. Each subband signal contains a certain range of frequencies of the audio signal.
- The filters of the analysis filter bank are selected so that they approximate the selection characteristics of the human inner ear.
- Such an analysis filter bank is also referred to as a gammatone filter bank.
- The rhythm information of each subband signal is evaluated in devices 304a to 304c.
- For each input signal, an envelope-like output signal is first calculated (corresponding to so-called "inner hair cell" processing in the ear) and subsampled. From this result, an autocorrelation function (AKF) is calculated to obtain the periodicity of the signal as a function of the delay, i.e. the "lag".
- AKF autocorrelation function
- At the output of the devices 304a to 304c there is then, for each subband signal, an autocorrelation function that represents aspects of the rhythm information of that subband signal.
- The individual autocorrelation functions of the subband signals are then combined in a device 306 by summation to obtain a sum autocorrelation function (SAKF), which reproduces the rhythm information of the signal at the audio input 300.
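As a rough sketch of this prior-art scheme, the per-subband autocorrelation and the unweighted summation might look like this. The function names and the unnormalized, direct-sum autocorrelation are assumptions made for illustration; a practical implementation would more likely use an FFT-based computation.

```python
def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag (direct sum)."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def summary_autocorrelation(subband_envelopes, max_lag):
    """Add the autocorrelation functions of all subbands with equal weight."""
    acfs = [autocorrelation(env, max_lag) for env in subband_envelopes]
    return [sum(values) for values in zip(*acfs)]


# Two toy envelopes, each with an onset every 4 samples.
env_a = [1, 0, 0, 0] * 4
env_b = [1, 0, 0, 0] * 4
sakf = summary_autocorrelation([env_a, env_b], max_lag=8)
# Strongest lag, ignoring the trivial maximum at lag 0.
best_lag = max(range(1, len(sakf)), key=sakf.__getitem__)
print(sakf, best_lag)
```

Because every subband enters the sum with the same weight, a band without clear periodicity contributes just as much as a band with a strong beat.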
- SAKF sum auto-correlation function
- This information can be output at a tempo output 308.
- Large values in the sum autocorrelation indicate that, for the delay (lag) associated with a peak of the SAKF, there is a high periodicity of note onsets. Therefore, for example, the largest value of the sum autocorrelation function within the musically reasonable delays is sought.
- Musically reasonable delays correspond, for example, to the tempo range between 60 bpm and 200 bpm.
- The device 306 can also be arranged to convert a delay time into tempo information. For example, a peak at a delay of one second corresponds to a tempo of 60 beats per minute. Smaller delays indicate higher tempos, while larger delays indicate tempos lower than 60 bpm.
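The conversion between delay and tempo is simply tempo = 60 / delay, with the delay in seconds. A small sketch follows; the function names and the `sample_rate` parameter (the rate of the subsampled envelope, in samples per second) are assumptions for illustration.

```python
def lag_to_bpm(lag_seconds):
    """Convert a delay in seconds into a tempo in beats per minute."""
    return 60.0 / lag_seconds


def tempo_range_to_lags(min_bpm, max_bpm, sample_rate):
    """Return (min_lag, max_lag) in samples for a tempo range.

    The fastest tempo gives the smallest lag, the slowest tempo the largest.
    """
    min_lag = int(round(60.0 * sample_rate / max_bpm))
    max_lag = int(round(60.0 * sample_rate / min_bpm))
    return min_lag, max_lag


print(lag_to_bpm(1.0))                     # delay of one second
print(tempo_range_to_lags(60, 200, 100))   # envelope sampled at 100 Hz
```

With an envelope rate of 100 Hz, the 60 bpm to 200 bpm range corresponds to lags from 30 to 100 samples, so only that part of the autocorrelation function needs to be searched.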
- Compared with the first method, this method has the advantage that no oscillators requiring great computing and storage effort have to be implemented.
- The concept is disadvantageous, however, in that the quality of the results depends very strongly on the type of audio signal. If, for example, a dominant rhythm instrument can be heard in an audio signal, the concept described in FIG. 3 will work well. If, on the other hand, the voice is dominant, which will not deliver particularly clear rhythm information, the rhythm determination will be ambiguous.
- The audio signal could, however, also contain a band that holds only rhythm information, such as a higher frequency band in which, for example, the hi-hat of a drum kit is positioned, or a lower frequency band in which the bass drum of the drum kit is positioned on the frequency scale. Because the individual pieces of information are combined, however, the fairly unambiguous information of these particular subbands is overlaid or "watered down" by the ambiguous information of the other subbands.
- The specialist publication "A Computationally Efficient Multipitch Analysis Model" discloses a computationally efficient model for a periodicity analysis of complex audio signals.
- The computing model divides the signal into two channels, one channel below 1000 Hz and one channel above 1000 Hz. From these, an autocorrelation of the lower channel and an autocorrelation of the envelope of the upper channel are calculated. Finally, the two autocorrelation functions are summed. To eliminate ambiguities in the sum autocorrelation function, the sum autocorrelation function is processed further to obtain a so-called enhanced summary autocorrelation function (ESACF).
- ESACF enhanced summary autocorrelation Function
- The object of the present invention is to provide a computing-time-efficient and robust device and a computing-time-efficient and robust method for analyzing an audio signal with regard to rhythm information.
- This object is achieved by a device for analyzing an audio signal according to claim 1 or by a method for analyzing an audio signal according to claim 11.
- The present invention is based on the finding that the individual frequency bands, i.e. the subbands, often offer differently favorable conditions for finding rhythmic periodicities. While in pop music, for example, the signal in the middle range, for example around 1 kHz, is often dominated by singing that does not correspond to the beat, the higher frequency ranges often contain mainly percussion sounds, such as the hi-hat of the drum kit, which allow a very good extraction of rhythmic regularities. In other words, depending on the audio signal, different frequency bands contain a different amount of rhythmic information, or have different quality or significance for the rhythm information of the audio signal.
- The audio signal is therefore first decomposed into subband signals.
- Each subband signal is examined with regard to its periodicity in order to obtain raw rhythm information for each subband signal.
- An assessment of the quality of the periodicity of each subband signal is then performed to obtain a significance measure for each subband signal.
- A high significance measure indicates that there is clear rhythm information in this subband signal, while a low significance measure suggests that less clear rhythm information is available in this subband signal.
- A modified envelope of the subband signal is calculated, and then an autocorrelation function of the envelope is calculated.
- The autocorrelation function of the envelope provides the raw rhythm information. There is clear rhythm information when the autocorrelation function has clear maxima, while there is less clear rhythm information when the autocorrelation function of the envelope of the subband signal has less pronounced signal peaks or no signal peaks at all.
- An autocorrelation function that has clear peaks receives a high significance measure, while an autocorrelation function that has a relatively flat course receives a low significance measure.
- The individual raw rhythm information of the individual subband signals is not simply combined "blindly", but is used, taking into account the significance measure for each subband signal, to obtain the rhythm information of the audio signal.
- If a subband signal has a high significance measure, it is preferred when determining the rhythm information, while a subband signal that has a low significance measure, i.e. a low quality in terms of rhythm information, is hardly considered or, in extreme cases, not considered at all when determining the rhythm information of the audio signal.
- This can be implemented by a weighting factor that depends on the significance measure. While a subband signal that has good quality for the rhythm information, i.e. a high significance measure, receives a weighting factor of 1, another subband signal that has a smaller significance measure receives a weighting factor less than 1. In extreme cases, a subband signal that has a completely flat autocorrelation function receives a weighting factor of 0. The weighted autocorrelation functions, i.e. the weighted raw rhythm information, are then simply added up.
- The concept according to the invention is advantageous in that it enables a robust determination of the rhythm information, since subband signals with no clear or even deviating rhythm information, e.g. when the singing has a different rhythm than the actual beat of the piece, do not "water down" or "falsify" the rhythm information of the audio signal. In addition, very noise-like subband signals, which deliver an autocorrelation function with a completely flat course, do not worsen the signal-to-noise ratio in determining the rhythm information. This is exactly what would occur, however, if, as in the prior art, all autocorrelation functions of the subband signals were simply added up with the same weight.
- Another advantage of the method according to the invention is that the significance measure can be determined with little additional computing effort, and that the weighting of the raw rhythm information with the significance measure and the subsequent summation can be carried out efficiently without great storage and computing-time effort, which recommends the inventive concept in particular for real-time applications.
- FIG. 1 shows a block diagram of a device for analyzing an audio signal with regard to rhythm information.
- The audio signal is fed via an input 100 to a device 102 for splitting the audio signal into at least two subband signals 104a and 104b. Each subband signal 104a, 104b is fed into a device 106a or 106b for examining it with regard to periodicities in the subband signal, in order to obtain raw rhythm information 108a or 108b for each subband signal.
- The raw rhythm information is then fed to a device 110a or 110b for assessing the quality of the periodicity of each of the at least two subband signals, in order to obtain a significance measure 112a, 112b for each of the at least two subband signals.
- Both the raw rhythm information 108a, 108b and the significance measures 112a, 112b are fed to a device 114 for determining the rhythm information of the audio signal.
- When determining the rhythm information of the audio signal, the device 114 takes into account the significance measures 112a, 112b for the subband signals and the raw rhythm information 108a, 108b of at least one subband signal.
- If, for example, the device 110a for quality assessment has found that no particular periodicity is present in the subband signal 104a, the significance measure 112a will be very small or equal to 0.
- The rhythm information of the audio signal is then determined solely and exclusively on the basis of the raw rhythm information 108b of the subband signal 104b.
- The device 102 for splitting the audio signal can be a conventional analysis filter bank that supplies a user-selectable number of subband signals on the output side.
- Each subband signal is then subjected to processing by the devices 106a, 106b and 106c, after which significance measures for each raw rhythm information are determined by the devices 110a to 110c.
- In the preferred embodiment shown, the device 114 includes a device 114a for computing weighting factors for each subband signal on the basis of the significance measure for this subband signal and optionally also of the other subband signals.
- The concept according to the invention thus presents itself as follows. After the rhythmic information of the individual bands has been evaluated, which can be done, for example, by envelope formation, smoothing, differentiation, limitation to positive values and formation of the autocorrelation function (devices 106a to 106c), the value or quality of these intermediate results is assessed in the devices 110a to 110c. This is achieved with the help of an evaluation function that assesses the reliability of each individual result with a significance measure. From the significance measures of all subband signals, a weighting factor for each band is derived for the extraction of the rhythm information. The overall result of the rhythm extraction is then obtained in the device 114b by combining the band-by-band individual results, taking into account their respective weighting factors.
- The raw rhythm information 108a, 108b, 108c, which represents the periodicity of the respective subband signal, is determined by means of an autocorrelation function.
- Any autocorrelation function always has a local maximum at a delay of 0, which represents the energy of the signal. This maximum should be disregarded so that the quality determination is not falsified.
- The autocorrelation function should also be considered only in a particular tempo range, i.e. from a maximum delay, which corresponds to the slowest tempo of interest, to a minimum delay, which corresponds to the highest tempo of interest.
- A typical tempo range is between 60 bpm and 200 bpm.
- As a significance measure, the ratio between the arithmetic mean of the autocorrelation function in the tempo range of interest and the geometric mean of the autocorrelation function in the tempo range of interest can be determined. It is known that when all values of the autocorrelation function are the same, i.e. if the autocorrelation function has a flat course, the geometric mean and the arithmetic mean of the autocorrelation function are the same. In this case the significance measure would have a value equal to 1, which means that the raw rhythm information is not significant.
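A minimal sketch of this ratio, evaluated only between a minimum and a maximum lag so that the maximum at lag 0 is excluded. The function name, the lag parameters and the small floor `eps` (needed because the geometric mean is undefined for zero or negative values) are assumptions and are not taken from the patent text.

```python
import math


def significance(acf, min_lag, max_lag, eps=1e-12):
    """Arithmetic mean divided by geometric mean of acf[min_lag..max_lag].

    Values are floored at `eps` (hypothetical safeguard) so that the
    logarithm used for the geometric mean is always defined.
    """
    window = [max(v, eps) for v in acf[min_lag:max_lag + 1]]
    arithmetic = sum(window) / len(window)
    geometric = math.exp(sum(math.log(v) for v in window) / len(window))
    return arithmetic / geometric


flat = [5.0] * 10                        # no periodicity at all
peaky = [5, 1, 1, 1, 9, 1, 1, 1, 1, 1]   # clear peak at lag 4
print(significance(flat, 2, 8))
print(significance(peaky, 2, 8))
```

The flat function yields a ratio of 1, while the function with a clear peak yields a ratio above 1, matching the interpretation given in the text.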
- Various options exist with regard to the weighting factors.
- A relative weighting is preferred, such that all weighting factors of all subband signals add up to 1, i.e. the weighting factor of a band is determined as the significance value of this band divided by the sum of all significance values. In this case, a relative weighting is performed before the summation of the weighted raw rhythm information to obtain the rhythm information of the audio signal.
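The relative weighting and the subsequent weighted summation can be sketched as follows. The function names are hypothetical, and returning all-zero weights when every significance value is 0 is an assumption, since the text does not cover that case.

```python
def relative_weights(significances):
    """Weight of each band = its significance / sum of all significances."""
    total = sum(significances)
    if total == 0:
        # Assumed fallback: no band carries usable rhythm information.
        return [0.0] * len(significances)
    return [s / total for s in significances]


def combine(raw_infos, significances):
    """Weighted sum of the per-band autocorrelation functions, lag by lag."""
    weights = relative_weights(significances)
    length = len(raw_infos[0])
    return [sum(w * info[lag] for w, info in zip(weights, raw_infos))
            for lag in range(length)]


band_a = [0, 4, 0]   # raw rhythm information of a significant band
band_b = [8, 0, 0]   # raw rhythm information of a less significant band
print(relative_weights([3.0, 1.0]))
print(combine([band_a, band_b], [3.0, 1.0]))
```

Compared with an unweighted sum, the band with the higher significance value dominates the combined result.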
- The audio signal is fed via the audio signal input 100 into the device 102 for decomposing the audio signal into subband signals 104a and 104b. Each subband signal is then examined in the device 106a or 106b, as has been explained, using an autocorrelation function to determine the periodicity of the subband signal.
- The raw rhythm information 108a, 108b is then available. It is fed into a device 118a or 118b in order to post-process the raw rhythm information output by the device 116a by means of the autocorrelation function. This ensures, among other things, that the ambiguities of the autocorrelation function, i.e. the fact that signal peaks also occur at integer multiples of the delays, can be eliminated on a subband basis to obtain post-processed raw rhythm information 120a or 120b.
- The quality assessment can also be based on the post-processed raw rhythm information. This latter option is preferred because a quality assessment on the basis of the post-processed raw rhythm information ensures that the quality of information that is no longer ambiguous is judged.
- The determination of the rhythm information by the device 114 then takes place on the basis of the post-processed rhythm information of a channel and preferably also on the basis of the significance measure for this channel.
- If the quality assessment is carried out on the basis of the raw rhythm information, i.e. the signal before the device 118a, this is advantageous in that, if it is determined that the significance measure is 0, i.e. that the autocorrelation function has a flat course, the post-processing by means of the device 118a can be dispensed with completely in order to save computing-time resources.
- FIG. 5 is discussed in order to describe the structure of a device 118a or 118b for post-processing the raw rhythm information in more detail.
- To eliminate the ambiguities on a subband basis, a spread autocorrelation function can be calculated by means of a device 121, as in the prior art, the device 121 being arranged to calculate the spread autocorrelation function such that it is spread by an integer multiple of a delay.
- In this case, a device 122 is arranged to subtract the spread autocorrelation function from the original autocorrelation function, i.e. the raw rhythm information 108a.
- The device 121 can also be arranged to calculate an autocorrelation function compressed by an integer factor, which is then added by the device 122 to the raw rhythm information in order to also generate components for delays t0/2, t0/3, etc.
- The spread or compressed versions of the raw rhythm information 108a can be weighted before adding or subtracting in order to achieve flexibility in the sense of high robustness.
- When the periodicity of a subband signal is examined on the basis of an autocorrelation function, a further improvement can be achieved if the properties of the autocorrelation function are taken into account and post-processing is performed using the device 118a or 118b.
- A periodic sequence of note onsets with a spacing t0 produces an AKF peak not only at a delay t0 but also at 2t0, 3t0, etc. This leads to an ambiguity in tempo detection, i.e. in the search for significant maxima in the autocorrelation function. The ambiguities can be eliminated if versions of the AKF spread by integer factors are subtracted (with weighting) from the original on a subband basis.
- AKF post-processing therefore takes place on a subband basis, whereby an autocorrelation function is calculated for at least one subband signal and is combined with stretched or spread versions of this function.
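The subtraction of spread versions can be illustrated with the following sketch. It is not the patented implementation: the function names, the linear interpolation used for spreading, the choice of factors 2 and 3, and the clipping of the result to non-negative values are all assumptions made to obtain a small runnable example.

```python
def spread(acf, factor):
    """Stretch an AKF along the lag axis by an integer factor.

    The value at lag t of the result is the value of the original at
    lag t / factor, linearly interpolated between neighbouring lags.
    """
    out = []
    for lag in range(len(acf)):
        pos = lag / factor
        lo = int(pos)
        hi = min(lo + 1, len(acf) - 1)
        frac = pos - lo
        out.append((1 - frac) * acf[lo] + frac * acf[hi])
    return out


def remove_multiples(acf, factors=(2, 3), weight=1.0):
    """Subtract weighted spread versions to suppress peaks at 2*t0, 3*t0."""
    result = list(acf)
    for k in factors:
        stretched = spread(acf, k)
        # Clipping at zero is an assumed safeguard against negative values.
        result = [max(0.0, r - weight * s)
                  for r, s in zip(result, stretched)]
    return result


# Toy AKF with the lag-0 maximum already removed: peaks at 4, 8 and 12.
acf = [0.0] * 13
acf[4], acf[8], acf[12] = 3.0, 2.0, 1.0
print(remove_multiples(acf))
```

In this toy case only the peak at lag 4 survives, so the search for the maximum is no longer ambiguous between t0 and its multiples.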
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Electrophonic Musical Instruments (AREA)
- Auxiliary Devices For Music (AREA)
Description
- Fig. 1
- a block diagram of a device for analyzing an audio signal with a quality assessment of the raw rhythm information;
- Fig. 2
- a block diagram of a device for analyzing an audio signal using weighting factors based on the significance measures;
- Fig. 3
- a block diagram of a known device for analyzing an audio signal with regard to rhythm information;
- Fig. 4
- a block diagram of a device for analyzing an audio signal with regard to rhythm information using an autocorrelation function with subband-wise post-processing of the raw rhythm information; and
- Fig. 5
- a detailed block diagram of the post-processing device of Fig. 4.
Claims (11)
- Device for analyzing an audio signal with regard to rhythm information of the audio signal, comprising: a device (102) for decomposing the audio signal into at least two subband signals (104a, 104b); a device for examining (106a, 106b) a subband signal with regard to a periodicity in the subband signal in order to obtain raw rhythm information (108a, 108b) for the subband signal; a device for assessing (110a, 110b) a quality of the periodicity of the raw rhythm information (108a) of the subband signal (104a) in order to obtain a significance measure (112a) for the subband signal; and a device (114) for determining the rhythm information of the audio signal taking into account the significance measure (112a) of the subband signal and the raw rhythm information (108a, 108b) of at least one subband signal.
- Device according to claim 1, in which the device for examining (106a, 106b) is designed to calculate an autocorrelation function for each of the at least two subband signals.
- Device according to claim 1 or 2, in which the device for examining (106a, 106b) comprises: a device for forming an envelope of a subband signal; a device for smoothing the envelope of the subband signal in order to obtain a smoothed envelope; a device for differentiating the smoothed envelope in order to obtain a differentiated envelope; a device for limiting the differentiated envelope to positive values in order to obtain a limited envelope; and a device for forming an autocorrelation function of the limited envelope in order to obtain the raw rhythm information (108a, 108b).
- Device according to claim 2 or 3, in which the device for assessing (110a, 110b) the quality is designed to use, as the significance measure, a ratio of a maximum of the autocorrelation function to a mean value of the autocorrelation function.
- Device according to claim 2 or 3, in which the device for assessing (110a, 110b) the quality is designed to use, as the significance measure, a ratio of an arithmetic mean of the raw rhythm information to a geometric mean of the raw rhythm information.
- Device according to claim 4 or 5, in which the device for assessing (110a, 110b) the quality is designed to evaluate the autocorrelation function only within a tempo range that extends from a minimum delay, to obtain a maximum tempo, to a maximum delay, to obtain a minimum tempo.
- Device according to one of the preceding claims, in which the device for determining (114) comprises: a device (114a) for deriving a weighting factor for a subband using the significance measure for the subband; a device (114b) for weighting the raw rhythm information of the subband using the weighting factor for the subband in order to obtain weighted raw rhythm information for the subband, and for combining the weighted raw rhythm information of the subband with weighted or unweighted raw rhythm information of the other subband in order to obtain the rhythm information of the audio signal.
- Device according to claim 7, in which the device (114a) for deriving a weighting factor is arranged to derive a relative weighting factor for each subband signal, a sum of the weighting factors for all subband signals being 1.
- Device according to claim 8, in which the device (114a) for deriving a weighting factor is arranged to derive a weighting factor as the ratio of the significance measure of a subband signal to the sum of the significance measures of all subband signals.
- Device according to claim 9, in which the device (106a, 106b) for examining a subband signal is arranged to examine a subband signal whose length is greater than 10 seconds.
- Method for analyzing an audio signal with regard to rhythm information of the audio signal, comprising the following steps: decomposing the audio signal into at least two subband signals (104a, 104b); examining (106a, 106b) a subband signal with regard to a periodicity in the subband signal in order to obtain raw rhythm information (108a, 108b) for the subband signal; assessing (110a, 110b) a quality of the periodicity of the raw rhythm information (108a) of the subband signal (104a) in order to obtain a significance measure (112a) for the subband signal; and determining the rhythm information of the audio signal taking into account the significance measure (112a) of the subband signal and the raw rhythm information (108a, 108b) of at least one subband signal.
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DE10123366A DE10123366C1 (de) | 2001-05-14 | 2001-05-14 | Device for analyzing an audio signal with regard to rhythm information |
| DE10123366 | 2001-05-14 | ||
| PCT/EP2002/004618 WO2002093557A1 (de) | 2001-05-14 | 2002-04-25 | Device and method for analyzing an audio signal with regard to rhythm information |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| EP1388145A1 EP1388145A1 (de) | 2004-02-11 |
| EP1388145B1 true EP1388145B1 (de) | 2004-10-13 |
Family
ID=7684710
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP02745267A Expired - Lifetime EP1388145B1 (de) | 2001-05-14 | 2002-04-25 | Device and method for analyzing an audio signal with regard to rhythm information |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20040068401A1 (de) |
| EP (1) | EP1388145B1 (de) |
| JP (1) | JP3914878B2 (de) |
| AT (1) | ATE279769T1 (de) |
| DE (2) | DE10123366C1 (de) |
| WO (1) | WO2002093557A1 (de) |
Families Citing this family (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8535236B2 (en) * | 2004-03-19 | 2013-09-17 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for analyzing a sound signal using a physiological ear model |
| JP4979018B2 (ja) | 2005-07-01 | 2012-07-18 | パイオニア株式会社 | コンピュータプログラム、並びに情報再生装置及び方法 |
| JP4940588B2 (ja) | 2005-07-27 | 2012-05-30 | ソニー株式会社 | ビート抽出装置および方法、音楽同期画像表示装置および方法、テンポ値検出装置および方法、リズムトラッキング装置および方法、音楽同期表示装置および方法 |
| JP4948118B2 (ja) * | 2005-10-25 | 2012-06-06 | ソニー株式会社 | 情報処理装置、情報処理方法、およびプログラム |
| JP4465626B2 (ja) * | 2005-11-08 | 2010-05-19 | ソニー株式会社 | 情報処理装置および方法、並びにプログラム |
| CN101326569B (zh) | 2005-12-09 | 2012-07-18 | 索尼株式会社 | 音乐编辑设备和音乐编辑方法 |
| JP4949687B2 (ja) | 2006-01-25 | 2012-06-13 | ソニー株式会社 | ビート抽出装置及びビート抽出方法 |
| JP5351373B2 (ja) * | 2006-03-10 | 2013-11-27 | 任天堂株式会社 | 演奏装置および演奏制御プログラム |
| US8184712B2 (en) | 2006-04-30 | 2012-05-22 | Hewlett-Packard Development Company, L.P. | Robust and efficient compression/decompression providing for adjustable division of computational complexity between encoding/compression and decoding/decompression |
| US7645929B2 (en) * | 2006-09-11 | 2010-01-12 | Hewlett-Packard Development Company, L.P. | Computational music-tempo estimation |
| US8463615B2 (en) * | 2007-07-30 | 2013-06-11 | Google Inc. | Low-delay audio coder |
| WO2009125489A1 (ja) * | 2008-04-11 | 2009-10-15 | パイオニア株式会社 | テンポ検出装置及びテンポ検出プログラム |
| US8168876B2 (en) * | 2009-04-10 | 2012-05-01 | Cyberlink Corp. | Method of displaying music information in multimedia playback and related electronic device |
| US8996538B1 (en) | 2009-05-06 | 2015-03-31 | Gracenote, Inc. | Systems, methods, and apparatus for generating an audio-visual presentation using characteristics of audio, visual and symbolic media objects |
| US8071869B2 (en) * | 2009-05-06 | 2011-12-06 | Gracenote, Inc. | Apparatus and method for determining a prominent tempo of an audio work |
| US8805854B2 (en) * | 2009-06-23 | 2014-08-12 | Gracenote, Inc. | Methods and apparatus for determining a mood profile associated with media data |
| WO2014132102A1 (en) * | 2013-02-28 | 2014-09-04 | Nokia Corporation | Audio signal analysis |
| JP6759545B2 (ja) * | 2015-09-15 | 2020-09-23 | ヤマハ株式会社 | Evaluation apparatus and program |
| EP3576088A1 (de) | 2018-05-30 | 2019-12-04 | Fraunhofer Gesellschaft zur Förderung der Angewand | Audio similarity evaluator, audio encoder, methods and computer program |
| US10666475B2 (en) * | 2018-10-29 | 2020-05-26 | Bae Systems Information And Electronic Systems Integration Inc. | Techniques for phase modulated signals having poor autocorrelation |
| CN111785237B (zh) * | 2020-06-09 | 2024-04-19 | Oppo广东移动通信有限公司 | Audio rhythm determination method and apparatus, storage medium, and electronic device |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB2207027B (en) * | 1987-07-15 | 1992-01-08 | Matsushita Electric Works Ltd | Voice encoding and composing system |
| JP2591430B2 (ja) * | 1993-06-30 | 1997-03-19 | 日本電気株式会社 | Vector quantization apparatus |
| JP3840684B2 (ja) * | 1996-02-01 | 2006-11-01 | ソニー株式会社 | Pitch extraction apparatus and pitch extraction method |
| JPH09293083A (ja) * | 1996-04-26 | 1997-11-11 | Toshiba Corp | Music retrieval apparatus and retrieval method |
| KR100269216B1 (ko) * | 1998-04-16 | 2000-10-16 | 윤종용 | Pitch determination system and method using spectro-temporal autocorrelation |
| DE10123281C1 (de) * | 2001-05-14 | 2002-10-10 | Fraunhofer Ges Forschung | Apparatus and method for analyzing an audio signal with regard to rhythm information of the audio signal using an autocorrelation function |
| JP3859462B2 (ja) * | 2001-05-18 | 2006-12-20 | 株式会社東芝 | Prediction parameter analysis apparatus and prediction parameter analysis method |
- 2001-05-14: DE, application DE10123366A, patent DE10123366C1 (de); not active (Expired - Fee Related)
- 2002-04-25: AT, application AT02745267T, patent ATE279769T1 (de); not active (IP Right Cessation)
- 2002-04-25: EP, application EP02745267A, patent EP1388145B1 (de); not active (Expired - Lifetime)
- 2002-04-25: US, application US10/467,704, patent US20040068401A1 (en); not active (Abandoned)
- 2002-04-25: DE, application DE2002501311, patent DE50201311D1 (de); not active (Expired - Lifetime)
- 2002-04-25: WO, application PCT/EP2002/004618, patent WO2002093557A1 (de); not active (Ceased)
- 2002-04-25: JP, application JP2002590149A, patent JP3914878B2 (ja); not active (Expired - Lifetime)
Also Published As
| Publication number | Publication date |
|---|---|
| DE10123366C1 (de) | 2002-08-08 |
| WO2002093557A1 (de) | 2002-11-21 |
| DE50201311D1 (de) | 2004-11-18 |
| EP1388145A1 (de) | 2004-02-11 |
| HK1059959A1 (en) | 2004-07-23 |
| JP2004528596A (ja) | 2004-09-16 |
| US20040068401A1 (en) | 2004-04-08 |
| JP3914878B2 (ja) | 2007-05-16 |
| ATE279769T1 (de) | 2004-10-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP1371055B1 (de) | | Apparatus for analyzing an audio signal with regard to rhythm information of the audio signal using an autocorrelation function |
| EP1388145B1 (de) | | Apparatus and method for analyzing an audio signal with regard to rhythm information |
| DE10232916B4 (de) | | Apparatus and method for characterizing an information signal |
| EP2099024B1 (de) | | Method for sound-object-oriented analysis and note-object-oriented processing of polyphonic sound recordings |
| Carlson et al. | | Two-formant models, pitch, and vowel perception |
| EP1368805B1 (de) | | Method and apparatus for characterizing a signal, and method and apparatus for generating an indexed signal |
| EP1606798B1 (de) | | Apparatus and method for analyzing an audio information signal |
| DE69901606T2 (de) | | Wideband speech synthesis from narrowband speech signals |
| DE69607223T2 (de) | | Method for pitch detection, in particular for plucked or percussion instruments |
| EP2351017B1 (de) | | Method for recognizing note patterns in pieces of music |
| DE10134471C2 (de) | | Method and apparatus for characterizing a signal, and method and apparatus for generating an indexed signal |
| EP1280138A1 (de) | | Method for analyzing audio signals |
| EP1563487B1 (de) | | Method for determining acoustic features of sound signals for the analysis of unknown sound signals and modification of sound generation |
| DE102004028694B3 (de) | | Apparatus and method for converting an information signal into a spectral representation with variable resolution |
| DE102004028693B4 (de) | | Apparatus and method for determining a chord type underlying a test signal |
| EP1377924B1 (de) | | Method and apparatus for extracting a signal identifier, method and apparatus for generating an associated database, and method and apparatus for referencing a search time signal |
| DE102004033867B4 (de) | | Method and apparatus for rhythmic processing of audio signals |
| EP1743324B1 (de) | | Apparatus and method for analyzing an information signal |
| Krusche | | Visualization and auralization of features learned by neural networks for musical instrument recognition |
| EP1381024A1 (de) | | Method for finding a tone sequence |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20030528 |
| | AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
| | GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
| | RIN1 | Information on inventor provided before grant (corrected) | Inventor name: UHLE, CHRISTIAN; Inventor name: CREMER, MARKUS; Inventor name: HERRE, JUERGEN; Inventor name: ROHDEN, JAN |
| | RIC1 | Information provided on IPC code assigned before grant | Ipc: 7G 10L 11/00 A |
| | RBV | Designated contracting states (corrected) | Designated state(s): AT CH DE FR GB LI |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1059959; Country of ref document: HK |
| | GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
| | GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
| | AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): AT CH DE FR GB LI |
| | REG | Reference to a national code | Ref country code: GB; Ref legal event code: FG4D; Free format text: NOT ENGLISH |
| | REG | Reference to a national code | Ref country code: CH; Ref legal event code: EP |
| | REF | Corresponds to: | Ref document number: 50201311; Country of ref document: DE; Date of ref document: 20041118; Kind code of ref document: P |
| | GBT | GB: translation of EP patent filed (GB section 77(6)(a)/1977) | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1059959; Country of ref document: HK |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: AT; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20050425 |
| | ET | FR: translation filed | |
| | PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
| | 26N | No opposition filed | Effective date: 20050714 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: CH; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20060430. Ref country code: LI; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20060430 |
| | REG | Reference to a national code | Ref country code: CH; Ref legal event code: PL |
| | REG | Reference to a national code | Ref country code: GB; Ref legal event code: 732E; Free format text: REGISTERED BETWEEN 20090409 AND 20090415 |
| | REG | Reference to a national code | Ref country code: FR; Ref legal event code: TP |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: DE; Payment date: 20120427; Year of fee payment: 11 |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: GB; Payment date: 20120425; Year of fee payment: 11 |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R119; Ref document number: 50201311; Country of ref document: DE |
| | GBPC | GB: European patent ceased through non-payment of renewal fee | Effective date: 20130425 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: GB; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20130425. Ref country code: DE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20131101 |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R079; Ref document number: 50201311; Country of ref document: DE; Free format text: PREVIOUS MAIN CLASS: G10L0011000000; Ipc: G10L0025060000 |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R119; Ref document number: 50201311; Country of ref document: DE; Effective date: 20131101. Ref country code: DE; Ref legal event code: R079; Ref document number: 50201311; Country of ref document: DE; Free format text: PREVIOUS MAIN CLASS: G10L0011000000; Ipc: G10L0025060000; Effective date: 20140929 |
| | REG | Reference to a national code | Ref country code: FR; Ref legal event code: PLFP; Year of fee payment: 15 |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: FR; Payment date: 20160425; Year of fee payment: 15 |
| | REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST; Effective date: 20171229 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: FR; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20170502 |