WO2008043140A1 - Method and System for Encoding Data into an Audio Signal - Google Patents

Info

Publication number
WO2008043140A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
phase
audio signal
frequency band
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/AU2007/001532
Other languages
English (en)
Inventor
Jeffrey L. Pages
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Innes Corp Pty Ltd
Original Assignee
Innes Corp Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2006905656A
Application filed by Innes Corp Pty Ltd
Publication of WO2008043140A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/018: Audio watermarking, i.e. embedding inaudible data in the audio signal

Definitions

  • the present invention relates to encoding techniques for inserting data into an audio signal for the purpose of subsequent extraction of the data from the encoded audio signal.
  • TVS: Traffic Verification System
  • TVS proved to be an effective audio tagging technique. However, in certain circumstances some issues were observed. In audio signals containing pure tones at the removed frequency band, a degradation in audio quality was observed.
  • a hypothetical explanation for the degradation relates to what happens when the narrow band of frequencies is removed from the audio signal. Not only would the amplitude information of the removed band be removed but also the phase information. While the data signal is modulated so as to replicate the amplitude of the removed band, potentially the phase information is not replicated.
  • a method of encoding an audio signal with data comprising the steps of: a) dividing the audio signal into a plurality of frequency band signals; b) obtaining a first bit of the data; c) adjusting the phase of one or more of the frequency band signals in accordance with the value of the first bit and a reference phase; d) following the adjustment, recombining the plurality of frequency band signals to form an encoded audio signal; and e) sequentially repeating steps a) to d) for each subsequent bit of the data.
  • step a) comprises the steps of: down-sampling the rate of the audio signal to a lower rate; and applying a Fast Fourier Transform to the down-sampled audio signal to obtain separate frequency band signals; and step d) includes the steps of: applying an inverse Fast Fourier Transform to the plurality of frequency band signals to provide a recombined signal; and up-sampling the recombined signal to the original audio signal rate.
  • the reference phase is obtained from a phase adjusted frequency band signal.
  • the reference phase may be obtained from the phase adjusted frequency band signal by low-pass filtering the phase adjusted signal and limiting the low-pass filtered signal.
  • the low-pass filtering can be applied by a FIR filter.
  • the step of adjusting the phase comprises adding or subtracting an offset to the quadrature (Q) component and limiting the signal to restore the original signal envelope.
  • the amount of offset may be adjustable.
  • a method of extracting data from an audio signal that has been encoded according to the above mentioned encoding method comprising the steps of: dividing the encoded audio signal into a plurality of frequency band signals to extract each phase adjusted frequency band signal; for each extracted phase adjusted frequency band signal: deriving the reference phase; determining the phase difference between the reference phase and the phase of the phase adjusted frequency band signal; determining a first bit of the data value in accordance with the determined phase difference; and sequentially repeating the preceding steps for each subsequent bit of the data until the data has been fully extracted.
  • the dividing step comprises: down-sampling the rate of the encoded audio signal to a lower rate; and applying a Fast Fourier Transform to the down-sampled encoded audio signal to obtain separate frequency band signals.
  • the step of deriving the reference phase comprises low-pass filtering the phase adjusted frequency band signal and limiting the low-pass filtered signal. This low-pass filtering may be applied by a FIR filter.
  • Preferred embodiments of the present invention advantageously provide an encoding technique which minimises any appreciable degradation of the audible quality of the encoded audio signal.
  • Fig. 1 illustrates a functional block diagram of an encoder;
  • Fig. 2 illustrates a functional block diagram of the phase perturbator for the encoder of Fig. 1;
  • Fig. 3 illustrates a functional block diagram of a convolutional code generator for the encoder of Fig. 1;
  • Fig. 4 illustrates a functional block diagram of a decoder.
  • DESCRIPTION OF PREFERRED EMBODIMENTS
  • encoder and decoder which respectively implement the encoding and decoding methods. It will be appreciated that the encoder and decoder can be implemented in software or hardware or a combination of both hardware and software.
  • A functional block diagram of the encoder is shown in Fig. 1.
  • the audio signal input 12 at sampling rate Fs is down-sampled 14 to 12 kHz and then passed through a fast Fourier transform (FFT) 16 to divide the audio signal into a plurality of selected frequency bands.
  • FFT: fast Fourier transform
  • a 64 point FFT with raised cosine windowing and a 50% overlap can be used to separate the individual frequency bands.
  • the frequency bands at 1125 Hz, 1500 Hz, 1875 Hz, 2250 Hz, 2625 Hz, 3000 Hz, 3375 Hz and 3750 Hz are selected to carry the watermark data, while the remaining frequency bands are unused in the watermarking process. It will be appreciated that the number of bands and the centre frequencies of the bands can be different. Having a plurality of bands provides redundancy which assists the decoding process.
  • the in-phase (I_sig) and quadrature (Q_sig) components of the FFT output are passed to a limiter stage 18 which separates this signal into a signal amplitude scalar and a pair of signal phase components I_sigphase and Q_sigphase, such that:
  • phase information carried by I_sigphase and Q_sigphase is modified by a phase perturbator 20 in accordance with the data bit to be encoded into the audio signal and then multiplied 22 by the previously obtained signal amplitude scalar to restore the original signal envelope.
  • the original signal is subtracted 24 from this perturbated signal and the resultant difference signals from each band are recombined in an inverse FFT 26.
  • because the signal passing through the FFT stage 16 is windowed using a raised cosine function, it is necessary to convolve the bank of difference signals with the Fourier transform of the window function before applying the inverse FFT 26.
  • the contribution of the band n difference signal to the input of the inverse FFT 26 is
  • the output of the inverse FFT 26 is up-sampled 28 back to sampling rate Fs and added 30 to a delayed version of the input.
  • the delay 32 is equal to the total delay through the sampling rate converters and fast Fourier transforms.
  • the resultant output signal 34 is the audio signal with its phase information adjusted in the selected frequency bands.
  • the phase adjustment represents the encoded binary value of the data bit, i.e. 1 or 0. It will be appreciated that the above process has encoded only one bit of data into the audio signal. In practice the amount of data required to be encoded would consist of a number of bits. For certain commercial applications up to 70 or 80 bits may need to be encoded. However, it will be appreciated that larger numbers of bits could be encoded using the present invention. To encode the remainder of the data, the process is repeated for each subsequent bit of the data. Therefore, if one bit of data is encoded into one second of audio signal, a data size of 70 bits would be encoded into 70 seconds of audio signal.
  • a bipolar serial data signal 40, representing the watermark data, is set to an appropriate level as determined by the audibility versus robustness trade-off, combined with a phase reference 42 and added to the signal phase vector 44, such that:
  • the effect of this is to add a component to the signal in a direction of either +90 degrees or -90 degrees (depending on the data binary value) relative to the phase reference 42.
  • the amount of offset added by the perturbator 20 is adjustable. At levels greater than about 0.2 it has been noted that the effect becomes audible so offsets less than this are recommended. The lowest amount of offset that can be detected reliably in the decoder depends on the bit rate as lower bit rates give longer integration times.
  • the reference phase 42 can be determined by any suitable means that can also be duplicated at the decoder. Given that the data encoding relates to a difference in phase, it is important that both the encoder and decoder can determine a common reference phase 42.
  • A sample of the perturbated signal is low-pass filtered 50 (for example, using a 256-tap FIR filter) and limited 52 to create the phase reference for the perturbator.
  • This method of deriving the reference 42 ensures that the reference 42 can be replicated in the decoder, varies slowly with time and is reasonably uncorrelated with the input signal to the perturbator 20 with typical source material.
  • One potential limitation on this process is that the time lag introduced by the filter 50 makes the watermarking very sensitive to slight sampling rate offsets between the encoder and decoder.
  • the data bits for each frequency band are obtained from a rate 1/8 convolutional code, with the 8 coded bits derived from each of the raw watermark information bits providing forward error correction.
  • an 8-bit shift register 60, representing a constraint-length-8 code, is used in conjunction with a code generator lookup table 62 to produce the 8 data bits 40 that are applied in parallel to the 8 perturbators 20.
  • the audio input 70 to be decoded is first down-sampled 72 to 12 kHz and then broken into the selected frequency bands using an FFT block 74 identical to that used in the encoder.
  • Each of the eight bands used to carry the watermark is limited 76, producing an in-phase and quadrature signal identical to the perturbated signal in the encoder.
  • the output of the phase comparator 82 is integrated 84 over one bit period, looking for the presence of a bias, either positive or negative, that is due to the watermark data phase perturbation.
  • the integrator outputs 86 from each of the eight frequency bands are passed to a trellis decoder (not shown) which determines the maximum likelihood serial bit sequence corresponding to the detected watermark bits.
  • a cyclic redundancy check (CRC) embedded in the watermark data stream determines the validity of the recovered watermark.
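
The limiter, phase perturbator, phase reference and integrating detector described above can be sketched for a single frequency band as follows. This is only an illustration under stated assumptions, not the patented implementation: a moving average stands in for the 256-tap FIR low-pass filter, the sign convention (bit 1 maps to +90 degrees) is assumed, the band samples are synthetic random phasors rather than FFT outputs of real audio, and encoder and decoder are taken to be perfectly synchronised. All function and class names are hypothetical.

```python
import cmath
import random
from collections import deque


def limit(z, fallback=1 + 0j):
    """Limiter: keep only the phase of z as a unit phasor."""
    magnitude = abs(z)
    return z / magnitude if magnitude > 0.0 else fallback


class PhaseReference:
    """Moving-average low-pass filter followed by a limiter (assumed stand-in
    for the FIR filter 50 and limiter 52)."""

    def __init__(self, taps=256):
        self.history = deque([0j] * taps, maxlen=taps)
        self.total = 0j

    def current(self):
        return limit(self.total)

    def update(self, phasor):
        # Running sum over the most recent `taps` phasors.
        self.total += phasor - self.history[0]
        self.history.append(phasor)


def encode_band(samples, bits, frames_per_bit, offset=0.15):
    """Perturb the phase of one band; the envelope is restored afterwards."""
    ref = PhaseReference()
    out = []
    for n, z in enumerate(samples):
        d = 1.0 if bits[n // frames_per_bit] else -1.0  # bipolar data
        amplitude = abs(z)
        # Add a small component at +/-90 degrees to the reference, then re-limit.
        perturbed = limit(limit(z) + offset * d * 1j * ref.current())
        ref.update(perturbed)  # reference is derived from the perturbed signal
        out.append(amplitude * perturbed)
    return out


def decode_band(samples, frames_per_bit):
    """Integrate the quadrature component relative to the reference per bit."""
    ref = PhaseReference()
    bits, acc = [], 0.0
    for n, z in enumerate(samples):
        phasor = limit(z)
        acc += (phasor * ref.current().conjugate()).imag  # phase comparator
        ref.update(phasor)
        if (n + 1) % frames_per_bit == 0:
            bits.append(1 if acc > 0 else 0)
            acc = 0.0
    return bits


rng = random.Random(0)
bits = [1, 0, 1, 1, 0]
frames_per_bit = 4000
band = [cmath.rect(rng.uniform(0.1, 1.0), rng.uniform(-cmath.pi, cmath.pi))
        for _ in range(len(bits) * frames_per_bit)]
marked = encode_band(band, bits, frames_per_bit)
recovered = decode_band(marked, frames_per_bit)
```

Because the decoder rebuilds the reference from the same perturbed phasors the encoder used, both sides track a common reference without side information. The long bit period in this sketch reflects the point made above that smaller offsets need longer integration times to be detected reliably.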
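
The rate 1/8 convolutional code generator (8-bit shift register plus lookup table) might be sketched as below. The generator masks are placeholders chosen only for illustration; the source does not give the contents of the lookup table 62, and nothing here claims these masks give good error-correcting performance. The function names are likewise hypothetical.

```python
# Placeholder generator masks (assumption, not taken from the patent).
GENERATORS = [0xFF, 0xED, 0xD9, 0xB7, 0xAB, 0x9D, 0x8F, 0xF1]


def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


# Lookup table: shift-register state (0..255) -> tuple of 8 coded bits.
CODE_TABLE = [tuple(parity(state & g) for g in GENERATORS)
              for state in range(256)]


def convolutional_encode(bits):
    """Shift each information bit into an 8-bit register and look up the
    8 coded bits, one per watermark band."""
    state = 0
    coded = []
    for b in bits:
        state = ((state << 1) | (b & 1)) & 0xFF
        coded.append(CODE_TABLE[state])
    return coded


def to_bipolar(coded_bits):
    """Map coded bits {0, 1} to the bipolar levels {-1, +1}."""
    return tuple(2 * c - 1 for c in coded_bits)


message = [1, 0, 1, 1, 0, 0, 1, 0]
coded = convolutional_encode(message)
band_levels = [to_bipolar(c) for c in coded]
```

Each entry of `band_levels` holds eight bipolar values, one for each of the eight perturbators, so every information bit is spread across all eight bands.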

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

The invention relates to a method of encoding an audio signal (12) with data. The audio signal (12) is divided into a plurality of frequency band signals and a first bit (40) of the data is obtained. The phase of one or more of the frequency band signals is adjusted in accordance with the value of the first bit (40) and a reference phase (42). Following this adjustment, the plurality of frequency band signals are recombined to form an encoded audio signal (34). These steps are repeated sequentially for each subsequent bit of the data. The invention also relates to a method of extracting the data from such an encoded audio signal.
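
As a small worked example of the band division, the figures given in the description (12 kHz down-sampled rate, 64-point FFT, 50% overlap, eight listed band frequencies) can be checked arithmetically. The sketch assumes the listed frequencies are FFT bin centres, which the source does not state outright, and the variable names are illustrative only.

```python
SAMPLE_RATE_HZ = 12_000
FFT_SIZE = 64
OVERLAP = 0.5
CARRIER_HZ = [1125, 1500, 1875, 2250, 2625, 3000, 3375, 3750]

# Spacing between adjacent FFT bins.
bin_spacing_hz = SAMPLE_RATE_HZ / FFT_SIZE

# Bin index of each watermark band, assuming the frequencies are bin centres.
carrier_bins = [round(f / bin_spacing_hz) for f in CARRIER_HZ]

# Hop between successive FFT frames and the resulting frame rate.
hop_samples = int(FFT_SIZE * (1 - OVERLAP))
frames_per_second = SAMPLE_RATE_HZ / hop_samples

print(bin_spacing_hz)      # 187.5
print(carrier_bins)        # [6, 8, 10, 12, 14, 16, 18, 20]
print(frames_per_second)   # 375.0
```

Under these assumptions the watermark occupies every second bin from 6 to 20, and a rate of one bit per second would give each band 375 frames over which to integrate.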
PCT/AU2007/001532 2006-10-12 2007-10-10 Method and System for Encoding Data into an Audio Signal Ceased WO2008043140A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2006905656A AU2006905656A0 (en) 2006-10-12 Method and System for Encoding Data into an Audio Signal
AU2006905656 2006-10-12

Publications (1)

Publication Number Publication Date
WO2008043140A1 (fr) 2008-04-17

Family

ID=39282344

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2007/001532 Ceased WO2008043140A1 (fr) 2006-10-12 2007-10-10 Method and System for Encoding Data into an Audio Signal

Country Status (1)

Country Link
WO (1) WO2008043140A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000059148A1 (fr) * 1999-03-29 2000-10-05 Markany Inc. Apparatus and method for digital watermarking
WO2001057868A1 (fr) * 2000-02-01 2001-08-09 Koninklijke Philips Electronics N.V. Embedding a watermark in an information signal
WO2005034398A2 (fr) * 2003-06-19 2005-04-14 University Of Rochester Data hiding via phase manipulation of audio signals
EP1764780A1 (fr) * 2005-09-16 2007-03-21 Deutsche Thomson-Brandt Gmbh Blind watermarking of audio signals using phase variations

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ANSARI R. ET AL.: "DATA-HIDING IN AUDIO USING FREQUENCY-SELECTIVE PHASE ALTERATION", INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. 5, 17 May 2004 (2004-05-17), pages V-389 *

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 07815337

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 07815337

Country of ref document: EP

Kind code of ref document: A1