CN112420023B - Music infringement detection method - Google Patents
- Publication number
- CN112420023B
- Authority
- CN
- China
- Prior art keywords
- music
- vectors
- frequency spectrum
- library
- spectrum signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/12—Protecting executable software
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Technology Law (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a music infringement detection method comprising the following steps. S1: perform a short-time Fourier transform on each piece of music in the music library in turn to obtain the frequency spectrum signal corresponding to each music ID; S2: perform dynamic resolution compression on the frequency spectrum signal; S3: calculate the extreme points of each frequency band interval from the compressed frequency spectrum signal; S4: filter the extreme points and subtract them pairwise to obtain the music vectors of the music library; S5: compress the music vectors of the music library bitwise into int32; S6: build a hash table with the int32 as Key and the music ID as Value, where music IDs are assigned incrementally in the order in which pieces of music enter the music library; S7: input training audio to obtain an infringement probability; S8: input test audio to obtain an infringement probability. The method extracts features from the spectral information of the music with a convolutional neural network and a fully connected network, so useful features can be extracted in multiple dimensions without manual screening, which improves the accuracy and efficiency of detection.
Description
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a music infringement detection method.
Background
The spread of the internet has made music widely available, and people can conveniently listen to music and use it in many ways, for example when creating videos. Music is copyrighted, however, and using it in commercial videos without authorization infringes that copyright and harms the rights and interests of its creators.
Patent publication CN101493918A discloses a method for monitoring music infringement, described as follows:
That invention relates to an online music piracy monitoring method and system, which comprises the following steps in sequence: the audio fingerprint extraction module acquires an audio download address from the Internet; the audio fingerprint extraction module reads an audio file from the download address and processes it to obtain an audio fingerprint; the monitoring analysis module compares this audio fingerprint with the audio fingerprint of the genuine audio file; if the comparison result is larger than the set threshold, the infringement positioning module further acquires information about the suspected infringer and sends the suspected infringer a warning. Compared with the prior art, the stated technical effect is that network spiders, audio fingerprint extraction, feature code extraction and similar techniques effectively monitor digital music resources on the network and obtain evidence of, and issue warnings about, infringing behavior, so that the whole process is automatic, saving cost and time and keeping rights enforcement timely.
Although the above patent can determine whether music is infringed, it cannot guarantee the accuracy and performance of infringement detection.
Disclosure of Invention
To solve these problems, the invention provides a deep-learning-based music infringement detection method that not only determines infringement but also greatly improves the accuracy and performance of infringement detection.
The technical scheme of the invention is as follows:
A music infringement detection method comprises the following steps:
S1: perform a short-time Fourier transform on music in a music library to obtain a frequency spectrum signal;
S2: perform dynamic resolution compression on the frequency spectrum signal;
S3: calculate the extreme points of each frequency band interval from the compressed frequency spectrum signal;
S4: filter the extreme points, and subtract them pairwise to obtain the music vectors of the music library;
S5: compress the music vectors of the music library bitwise into int32;
S6: repeat steps S1-S5 for all music in the music library, and construct a hash table with the int32 as Key and the music ID as Value, where music IDs are assigned incrementally in the order in which pieces of music enter the music library;
S7: input training audio, obtain its frequency spectrum signal by short-time Fourier transform, and repeat steps S2-S5 to obtain a training music vector; collide this vector with the hash table containing all music vectors of the music library, sort the collisions by the time at which they occur, calculate the Euclidean distance between the two vectors, and normalize it to obtain the infringement probability;
S8: input test audio, obtain its frequency spectrum signal by short-time Fourier transform, and repeat steps S2-S5 to obtain a test music vector; collide this vector with the hash table containing all music vectors of the music library, sort the collisions by the time at which they occur, calculate the Euclidean distance between the two vectors, and normalize it to obtain the infringement probability.
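Step S1 can be illustrated with a minimal short-time Fourier transform written against the Python standard library only. This is a sketch under assumptions, not the patented implementation: the function name `stft`, the Hann window, and the `window_size` and `hop` values are illustrative choices that the patent does not specify.

```python
import cmath
import math


def stft(signal, window_size=8, hop=4):
    """Return one magnitude spectrum per sliding window.

    Each spectrum covers frequency bins 0 .. window_size // 2.
    The Hann window is an assumed choice; the patent names no window.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        for k in range(window_size // 2 + 1):
            # Naive discrete Fourier transform of the windowed chunk.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# A pure tone with exactly 2 cycles per 8-sample window.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spectrogram = stft(tone)
# Index of the strongest frequency bin in every frame.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spectrogram]
```

For this tone every frame peaks at bin 2, which is the expected behavior for a signal whose frequency sits exactly on a bin. A production system would more likely use an FFT routine from a numerical library than this naive transform.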
Preferably, the specific process of the dynamic resolution compression in step S2 is as follows:
S2.1: form a spectrogram from the input Fourier-transformed spectrum signal;
S2.2: vertically and uniformly divide the spectrogram into a plurality of regions;
S2.3: extract features from the regions of step S2.2 through a convolutional neural network;
S2.4: judge whether each region contains useful features, and reject the regions that do not;
S2.5: splice the remaining regions into a new spectrogram.
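The divide, reject and splice flow of steps S2.2 to S2.5 can be sketched as follows. Several things here are assumptions made for illustration: the spectrogram is stored as rows of frequency bins by columns of time frames, "vertical" division is read as cutting it into equal column blocks, and the convolutional neural network's keep-or-reject decision is replaced by a hypothetical mean-energy threshold (`mean_energy_is_useful`).

```python
def split_columns(spectrogram, n_regions):
    """Split a 2-D list (rows x columns) into n_regions equal column blocks."""
    n_cols = len(spectrogram[0])
    if n_cols % n_regions != 0:
        raise ValueError("column count must be divisible by n_regions")
    width = n_cols // n_regions
    return [[row[i * width:(i + 1) * width] for row in spectrogram]
            for i in range(n_regions)]


def mean_energy_is_useful(region, threshold=0.5):
    """Hypothetical stand-in for the CNN's two-way (useful / not useful) output."""
    values = [v for row in region for v in row]
    return sum(values) / len(values) > threshold


def compress(spectrogram, n_regions, is_useful=mean_energy_is_useful):
    """Drop regions judged not useful and splice the rest back together."""
    regions = split_columns(spectrogram, n_regions)
    kept = [r for r in regions if is_useful(r)]
    if not kept:
        return [[] for _ in spectrogram]
    # Splice the kept regions column-wise, row by row.
    return [[v for region in kept for v in region[row_idx]]
            for row_idx in range(len(spectrogram))]


spec = [
    [0.9, 0.8, 0.0, 0.1, 0.7, 0.6],
    [0.7, 0.9, 0.1, 0.0, 0.8, 0.9],
]
compressed = compress(spec, n_regions=3)
```

Here the middle block is rejected because its mean is low, so the compressed spectrogram keeps four of the six columns. Passing a different `is_useful` callable is how a trained classifier would be plugged in.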
Preferably, the convolutional neural network comprises six convolutional layers and three fully connected layers; the convolutional layers comprise eight 1 × 1 convolution kernels, two of the fully connected layers comprise 1024 neurons each, and one comprises 2 neurons.
Preferably, calculating the extreme points in step S3 means finding a maximum value and a minimum value. The calculation formula of the maximum value is: [formula missing in source]. The calculation formula of the minimum value is: [formula missing in source].
Preferably, the filtering in step S4 comprises:
S4.1: screen all extreme points through a multilayer fully connected neural network;
S4.2: eliminate the extreme points for which the multilayer fully connected neural network outputs 0, and retain those for which it outputs 1;
S4.3: splice the remaining extreme points of the different frequency bands and output them.
Preferably, the fully-connected neural network comprises three layers, wherein each of the first and second layers comprises 1024 neurons, and the third layer comprises 2 neurons.
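The filtering of steps S4.1 to S4.3 can be sketched as a three-layer fully connected forward pass followed by a keep-or-reject decision. This is an illustration under assumptions: the hidden width is 2 rather than the 1024 neurons named above, the weights in `toy_params` are set by hand rather than trained, ReLU is an assumed activation, and feeding each extreme point to the network as a single-number feature is a guess about an input format the patent does not describe.

```python
def dense(x, weights, biases):
    """One fully connected layer; weights holds one row per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def keep_point(features, params):
    """Three-layer fully connected net; returns 1 (keep) or 0 (reject)."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(dense(features, w1, b1))
    h2 = relu(dense(h1, w2, b2))
    logits = dense(h2, w3, b3)  # two outputs: [reject, keep]
    return 1 if logits[1] > logits[0] else 0


def filter_and_splice(points_by_band, params):
    """Keep the points the network labels 1 and splice all bands together."""
    spliced = []
    for band in points_by_band:
        spliced.extend(p for p in band if keep_point([p], params) == 1)
    return spliced


# Hand-set toy parameters (hypothetical): the net keeps a point when |x| > 1.
toy_params = (
    ([[1.0], [-1.0]], [0.0, 0.0]),           # h1 = [relu(x), relu(-x)]
    ([[1.0, 1.0], [0.0, 0.0]], [0.0, 0.0]),  # h2 = [|x|, 0]
    ([[0.0, 0.0], [1.0, 0.0]], [1.0, 0.0]),  # logits = [1.0, |x|]
)
kept = filter_and_splice([[0.5, 2.0], [-3.0, 0.2]], toy_params)
```

With these toy weights the two small-magnitude points are rejected and the remaining points from both bands are concatenated in band order.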
The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the music infringement detection method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the music infringement detection method.
The beneficial effects of the invention are as follows: the method extracts features from the spectral information of the music with a convolutional neural network and a fully connected network, so useful features can be extracted in multiple dimensions without manual screening, which improves the accuracy and efficiency of detection.
Drawings
Fig. 1 is a flowchart of a method provided in an embodiment of the present invention.
Fig. 2 is a detailed flowchart of dynamic resolution compression.
FIG. 3 is a flow chart of extreme point filtering.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in fig. 1, an embodiment of the present invention provides a music infringement detection method, which includes the following specific steps:
1. Perform a short-time Fourier transform on the music in the music library to obtain a spectrogram.
The short-time Fourier transform applies a sliding time window to the input music signal and Fourier-transforms the signal inside each window, yielding the time-varying spectrum of the signal and thereby converting the time-domain signal into a frequency-domain signal.
2. Perform dynamic resolution compression on the obtained spectrogram.
3. Calculate all extreme points of each frequency band interval.
4. Filter the extreme points.
5. Obtain a vector by pairwise subtraction of the extreme points.
In steps 3-5, each piece of music corresponds to one vector: for each frequency band, the minimum value is subtracted from the maximum value to give one number, and the numbers for all frequency bands are combined to form the vector.
6. Repeat steps 1-5 for each piece of music in the music library to obtain the vectors corresponding to all pieces of music, compress the vectors bitwise into int32, and construct a HashTable with the int32 as Key and the music ID as Value.
7. Input training audio, repeat steps 2-5 to obtain its vector, collide it with the HashTable, sort the collisions for each piece of music by time, calculate the infringement probability, and compare the infringement result with the labels to obtain a trained model.
8. Input test audio, repeat steps 2-5 to obtain its vector, collide it with the HashTable, sort the collisions for each piece of music by time, calculate the infringement probability, and output whether infringement exists.
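The vector construction described for steps 3-5 (maximum minus minimum per frequency band, then concatenation) can be sketched as below. The function names, the layout of the spectrogram as one row per frequency bin, and the equal-sized grouping of bins into bands are assumptions for illustration; the filtering of step 4 is omitted here. The last lines reuse the first two maximum and minimum values quoted in the patent's own worked example.

```python
def band_extremes(spectrogram, n_bands):
    """Return (maximum, minimum) for each band of consecutive frequency rows."""
    n_bins = len(spectrogram)
    if n_bins % n_bands != 0:
        raise ValueError("bin count must be divisible by n_bands")
    size = n_bins // n_bands
    extremes = []
    for b in range(n_bands):
        values = [v for row in spectrogram[b * size:(b + 1) * size] for v in row]
        extremes.append((max(values), min(values)))
    return extremes


def music_vector(extremes):
    """One number per band: maximum minus minimum."""
    return [hi - lo for hi, lo in extremes]


# Toy spectrogram: 4 frequency bins x 3 time frames, grouped into 2 bands.
toy = [
    [0.1, 0.4, 0.2],
    [0.0, 0.3, 0.9],
    [-0.5, 0.2, 0.1],
    [0.6, -0.1, 0.0],
]
toy_extremes = band_extremes(toy, n_bands=2)
toy_vector = music_vector(toy_extremes)

# First two bands of music A from the worked example in the description.
example_vector = music_vector([
    (1.6393873e-05, -1.3222547e-05),
    (2.9647521e-05, -1.39852657e-05),
])
```

The example values reproduce the first two components of the vector the description reports for music A, 2.96164200e-05 and 4.36327867e-05, up to floating-point rounding.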
As an embodiment of the present invention, as shown in fig. 2, the specific process of step 2 is:
2.1. Input a spectrogram.
2.2. Vertically divide the spectrogram into a plurality of regions; in this embodiment it is divided into 256 regions.
2.3. Extract features from the regions of step 2.2 through a convolutional neural network.
2.4. Judge whether each region contains useful features, and reject the regions that do not.
2.5. Splice the remaining regions into a new spectrogram.
As an embodiment of the present invention, calculating the extreme points in step 3 means finding a maximum value and a minimum value. The calculation formula of the maximum value is: [formula missing in source]. The calculation formula of the minimum value is: [formula missing in source].
As an embodiment of the present invention, the convolutional neural network includes six convolutional layers and three fully connected layers; the convolutional layers include eight 1 × 1 convolution kernels, two of the fully connected layers include 1024 neurons each, and one includes 2 neurons.
As an embodiment of the present invention, as shown in fig. 3, the specific process of filtering in step 4 is:
4.1. Input the extreme points.
4.2. Pass them through the fully connected neural network.
4.3. Output whether each extreme point is retained.
4.4. Splice the remaining extreme points and output them.
As an embodiment of the present invention, the fully connected neural network comprises three layers, wherein the first and second layers each comprise 1024 neurons and the third layer comprises 2 neurons.
The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the music infringement detection method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the music infringement detection method.
A practical example of the method is as follows. The music library contains music A, B, C and D, with music IDs 1, 2, 3 and 4 respectively. Taking A as an example, a short-time Fourier transform is first performed on A to obtain the frequency spectrum signal [1.6393873e-05, -2.2720376e-05, -1.9727035e-05, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00]. The convolutional neural network then performs dynamic resolution compression on this signal to obtain [1.6393873e-05, -1.3622489e-05, -3.8468256e-05, ..., 1.4637652e-05, -2.58741654e-05, -1.8945687e-05]. Maximum and minimum values are then found region by region, and the extreme values are filtered by the neural network to obtain the maximum sequence [1.6393873e-05, 2.9647521e-05, 3.7123548, ..., 1.9647581e-05, 2.4874165e-05, 1.5512479e-05] and the minimum sequence [-1.3222547e-05, -1.39852657e-05, -3.7988510e-05, ..., -1.3347891e-05, -2.6955249e-05, -2.58741654e-05]. Subtracting the minimum sequence from the maximum sequence gives the vector of library music A: [2.96164200e-05, 4.36327867e-05, 3.71239279e+00, ..., 3.29954720e-05, 5.18294140e-05, 4.13866444e-05]. Music B, C and D are processed in the same way to obtain their vectors. The data in each vector is then compressed bitwise into int32, and a hash table is constructed as follows:
Key | Value |
---|---|
(vector corresponding to music A) | 1 |
(vector corresponding to music B) | 2 |
(vector corresponding to music C) | 3 |
(vector corresponding to music D) | 4 |
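One way to read "compressed bitwise into int32" is to reinterpret each component's 32-bit floating-point bit pattern as a signed 32-bit integer and use the resulting tuple as the hash key. The sketch below follows that reading, which is an assumption: the patent does not spell out the packing scheme, and the names `float_to_int32`, `vector_key` and `build_table` are hypothetical. Music IDs are assigned incrementally from 1 in library entry order, as the description states.

```python
import struct


def float_to_int32(x):
    """Reinterpret the IEEE 754 float32 bit pattern of x as a signed int32."""
    return struct.unpack("<i", struct.pack("<f", x))[0]


def vector_key(vector):
    """Hashable key: one int32 per vector component (assumed packing)."""
    return tuple(float_to_int32(v) for v in vector)


def build_table(library_vectors):
    """Map each key to a music ID assigned in order of entry, starting at 1."""
    table = {}
    for music_id, vector in enumerate(library_vectors, start=1):
        table[vector_key(vector)] = music_id
    return table


# Toy vectors standing in for music A, B, C and D.
library_vectors = [
    [1.0, 0.5],
    [0.25, 2.0],
    [-1.0, 0.0],
    [3.0, 3.0],
]
table = build_table(library_vectors)
hit = table.get(vector_key([0.25, 2.0]))   # exact match with music B
miss = table.get(vector_key([0.25, 2.5]))  # no identical key in the table
```

Under this reading a lookup only hits on bit-identical vectors, so a practical system would need coarser quantization or the distance comparison described in the next paragraph to tolerate small differences.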
Then the vector of the training music T is calculated by the same method and collided with the table above: T is compared pairwise with each music vector in the library and the Euclidean distance is calculated. The collision results (infringement probabilities) are then sorted by the time at which each collision occurs, giving the infringement probability of T against each piece of music in the library.
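The pairwise comparison can be sketched as a Euclidean distance followed by a normalization into the range (0, 1]. The normalization `1 / (1 + d)` is an assumption chosen for illustration, since the patent says only that the distance is normalized, and returning results in music ID order is likewise an assumed stand-in for sorting by collision time.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def infringement_probabilities(query, library):
    """Score a query vector against every library vector.

    library maps music ID -> vector. Returns (music_id, probability) pairs
    in music ID order; probability = 1 / (1 + distance) is an assumed
    normalization, equal to 1.0 for an identical vector.
    """
    results = []
    for music_id in sorted(library):
        d = euclidean(query, library[music_id])
        results.append((music_id, 1.0 / (1.0 + d)))
    return results


library = {1: [0.0, 0.0], 2: [3.0, 4.0], 3: [1.0, 1.0]}
scores = infringement_probabilities([0.0, 0.0], library)
best_id = max(scores, key=lambda s: s[1])[0]
```

In this toy library the query is identical to music 1, so music 1 scores 1.0 and is the most likely match, while music 2 at distance 5 scores 1/6.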
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone skilled in the art may, within the technical scope disclosed herein, modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or substitute equivalents for some of their technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the present invention and are intended to be covered by its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (5)
1. A music infringement detection method, characterized by comprising the following steps:
S1: carrying out short-time Fourier transform on music in a music library to obtain a frequency spectrum signal;
S2: performing dynamic resolution compression on the frequency spectrum signal;
S3: calculating an extreme point of each frequency band interval according to the compressed frequency spectrum signal;
S4: filtering the extreme points, and subtracting every two extreme points to obtain music vectors of the music library;
S5: compressing music vectors of the music library into int32 according to bits;
S6: repeating steps S1-S5 for all music in the music library, and constructing a hash table by taking int32 as Key and music ID as Value, wherein the music ID progressively marks each piece of music according to the time sequence of the music entering the music library;
S7: inputting training audio, acquiring a frequency spectrum signal of the training audio by using short-time Fourier transform, repeating steps S2-S5 to acquire training music vectors, colliding with a hash table containing all music library music vectors, sequencing according to the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain infringement probability;
S8: inputting test audio, acquiring a frequency spectrum signal of the test audio by using short-time Fourier transform, repeating steps S2-S5 to acquire a test music vector, colliding with a hash table containing all music library music vectors, sequencing according to the time of collision, calculating the Euclidean distance between the two vectors, and normalizing to obtain infringement probability; the specific process of dynamic resolution compression in step S2 is:
S2.1: forming a spectrogram by using the input Fourier transformed spectrum signal;
S2.2: vertically and uniformly dividing the spectrogram into a plurality of regions;
S2.3: performing feature extraction on the regions in step S2.2 through a convolutional neural network;
S2.4: judging whether each region belongs to useful features, and rejecting the regions not containing useful features;
S2.5: splicing the remaining regions into a new spectrogram;
the convolutional neural network comprises six convolutional layers and three fully-connected layers, wherein the convolutional layers comprise eight 1 × 1 convolutional kernels, two layers of the fully-connected layers comprise 1024 neurons, and one layer comprises 2 neurons.
3. The music infringement detection method of claim 1, wherein the filtering step in step S4 includes:
S4.1: screening all extreme points through a multilayer fully-connected neural network;
S4.2: eliminating the extreme points which are output as 0 after passing through the multilayer fully-connected neural network, and reserving the extreme points which are output as 1;
S4.3: splicing the remaining extreme points of different frequency bands and outputting them.
4. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 3 when executing the computer program.
5. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011352226.XA CN112420023B (en) | 2020-11-26 | 2020-11-26 | Music infringement detection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011352226.XA CN112420023B (en) | 2020-11-26 | 2020-11-26 | Music infringement detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112420023A CN112420023A (en) | 2021-02-26 |
CN112420023B true CN112420023B (en) | 2022-03-25 |
Family
ID=74843766
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011352226.XA Active CN112420023B (en) | 2020-11-26 | 2020-11-26 | Music infringement detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112420023B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101014953A (en) * | 2003-09-23 | 2007-08-08 | 音乐Ip公司 | Audio fingerprinting system and method |
CN101493918A (en) * | 2008-10-21 | 2009-07-29 | 深圳市牧笛科技有限公司 | On-line music pirate monitoring method and system |
CN104567674A (en) * | 2014-12-29 | 2015-04-29 | 北京理工大学 | Bilateral fitting confocal measuring method |
CN108899037A (en) * | 2018-07-05 | 2018-11-27 | 平安科技(深圳)有限公司 | Animal vocal print feature extracting method, device and electronic equipment |
CN109918539A (en) * | 2019-02-28 | 2019-06-21 | 华南理工大学 | A method for mutual retrieval of audio and video based on user click behavior |
CN111652177A (en) * | 2020-06-12 | 2020-09-11 | 中国计量大学 | Signal feature extraction method based on deep learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190385610A1 (en) * | 2017-12-08 | 2019-12-19 | Veritone, Inc. | Methods and systems for transcription |
EP3608918B1 (en) * | 2018-08-08 | 2024-05-22 | Tata Consultancy Services Limited | Parallel implementation of deep neural networks for classifying heart sound signals |
Non-Patent Citations (1)
Title |
---|
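Research on Multimedia Perceptual Hashing Algorithms and Their Applications (多媒体感知哈希算法及应用研究); Zhao Yuxin (赵玉鑫); Master's thesis; 20101231; full text *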
Also Published As
Publication number | Publication date |
---|---|
CN112420023A (en) | 2021-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106778241B (en) | Malicious file identification method and device | |
CN113257255B (en) | Method and device for identifying forged voice, electronic equipment and storage medium | |
CN112632609A (en) | Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium | |
KR101841985B1 (en) | Method and Apparatus for Extracting Audio Fingerprint | |
CN111652875A (en) | A video forgery detection method, system, storage medium, and video surveillance terminal | |
CN114266740A (en) | Quality inspection method, device, equipment and storage medium for traditional Chinese medicine decoction pieces | |
CN106845516A (en) | A Footprint Image Recognition Method Based on Multi-Sample Joint Representation | |
CN114168788A (en) | Audio audit processing method, device, equipment and storage medium | |
CN110942034A (en) | Method, system and device for detecting multi-type depth network generated image | |
CN117892125A (en) | Multi-class unbalanced network traffic data enhancement method based on improved generation of countermeasure network | |
CN114140670A (en) | Method and device for model ownership verification based on exogenous features | |
CN112420023B (en) | Music infringement detection method | |
CN119865263A (en) | Method, device, equipment, storage medium and product for processing interference component | |
CN106663102B (en) | Method and apparatus for generating a fingerprint of an information signal | |
CN117792737B (en) | Network intrusion detection method, device, electronic equipment and storage medium | |
CN113113051A (en) | Audio fingerprint extraction method and device, computer equipment and storage medium | |
CN111581640A (en) | Malicious software detection method, device and equipment and storage medium | |
CN116662186A (en) | Log playback assertion method and device based on logistic regression and electronic equipment | |
CN116703599A (en) | Transaction method based on cluster analysis and unsupervised learning algorithm and related products | |
CN112749391A (en) | Detection method and device for malicious software countermeasure sample and electronic equipment | |
KR101841983B1 (en) | Method and Apparatus for Identifying Audio Fingerprint | |
CN114327978B (en) | System fault mode identification method and system based on moment variable | |
CN114169415B (en) | System fault mode identification method and system | |
Yu et al. | Cumulant-based image fingerprints | |
CN112330632B (en) | Digital photo camera fingerprint attack detection method based on countermeasure generation network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||