CN111552832A - Risk user identification method and device based on voiceprint features and associated graph data - Google Patents
- Publication number
- CN111552832A (application number CN202010253799.0A)
- Authority
- CN
- China
- Prior art keywords
- voiceprint
- feature
- voice information
- user
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Business, Economics & Management (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Signal Processing (AREA)
- Library & Information Science (AREA)
- Quality & Reliability (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Databases & Information Systems (AREA)
- Technology Law (AREA)
- Economics (AREA)
- Animal Behavior & Ethology (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Development Economics (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention relates to artificial intelligence technology and discloses a risk user identification method based on voiceprint features and associated graph data, comprising: acquiring standard voice information of a user; extracting a first voiceprint feature from the standard voice information; inputting the first voiceprint feature into a preset association graph model to obtain associated graph data related to the first voiceprint feature; vectorizing the associated graph data to obtain an associated feature vector; and, if a voiceprint feature matching the first voiceprint feature exists in a preset black voiceprint library, or a label feature vector matching the associated feature vector exists in a preset black relationship graph, determining that the user is a risk user. The invention also provides a risk user identification device based on voiceprint features and associated graph data, an electronic device, and a computer-readable storage medium. The invention can reduce the missed-detection rate when identifying risk users, which helps improve information security.
Description
Technical Field
The present invention relates to the technical field of artificial intelligence, and in particular to a risk user identification method, device, electronic device, and computer-readable storage medium based on voiceprint features and associated graph data.
Background
The volume of information data is growing exponentially, and with it the need to verify the security of user information in order to identify potential risk users. In the prior art, user information is verified mainly by a single verification method in order to identify risk users. This approach has security loopholes and is prone to missed detections, so user information is easily stolen.
Summary of the Invention
The present invention provides a risk user identification method, device, electronic device, and computer-readable storage medium based on voiceprint features and associated graph data. Its main purpose is to reduce the missed-detection rate when identifying risk users and thereby improve information security.
To achieve the above purpose, the present invention provides a risk user identification method based on voiceprint features and associated graph data, comprising:
acquiring standard voice information of a user;
extracting a first voiceprint feature from the standard voice information;
inputting the first voiceprint feature into a preset association graph model to obtain associated graph data related to the first voiceprint feature;
vectorizing the associated graph data to obtain an associated feature vector;
determining whether a voiceprint feature matching the first voiceprint feature exists in a preset black voiceprint library; and
determining whether a label feature vector matching the associated feature vector exists in a preset black relationship graph;
if a voiceprint feature matching the first voiceprint feature exists in the preset black voiceprint library, or a label feature vector matching the associated feature vector exists in the preset black relationship graph, determining that the user is a risk user.
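The final step combines the two checks with a logical OR, so a hit in either the black voiceprint library or the black relationship graph is enough to flag the user. The following is a minimal sketch of that rule only; the function name `is_risk_user`, the matcher callables, and the toy in-memory lists are illustrative assumptions, not part of the patent text.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def is_risk_user(
    voiceprint: Vector,
    association_vector: Vector,
    matches_black_voiceprint: Callable[[Vector], bool],
    matches_black_graph: Callable[[Vector], bool],
) -> bool:
    """Flag the user if either check finds a match (logical OR)."""
    return matches_black_voiceprint(voiceprint) or matches_black_graph(association_vector)


# Hypothetical toy data: exact membership stands in for real matching.
black_voiceprints = [(0.1, 0.9), (0.7, 0.2)]
black_label_vectors = [(1.0, 0.0, 1.0)]


def in_voice_db(v: Vector) -> bool:
    return tuple(v) in black_voiceprints


def in_graph(v: Vector) -> bool:
    return tuple(v) in black_label_vectors


# The voiceprint misses, but the associated feature vector hits the graph.
print(is_risk_user((0.5, 0.5), (1.0, 0.0, 1.0), in_voice_db, in_graph))
```

Passing the two checks in as callables keeps the decision rule independent of how matching is actually performed, for example by thresholded similarity.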
Optionally, acquiring the standard voice information of the user comprises:
acquiring original voice information of the user;
sampling the original voice information with an analog-to-digital converter to obtain a digital voice signal;
performing a pre-emphasis operation on the digital voice signal to obtain a digitally filtered voice signal;
performing a framing and windowing operation on the digitally filtered voice signal to obtain the standard voice information.
Optionally, performing the framing and windowing operation on the digitally filtered voice signal comprises:
performing the framing and windowing operation on the digitally filtered voice signal by means of an objective function, the objective function being:
where n is the frame index of the digitally filtered voice signal, N is the total number of frames of the digitally filtered voice signal, and w(n) is the single-frame data of the standard voice information.
Optionally, extracting the first voiceprint feature from the standard voice information comprises:
performing a discrete Fourier transform on the standard voice information to obtain spectrum information of the standard voice information;
performing a triangular filtering calculation on the standard voice information with a triangular filter to obtain a frequency response value of the standard voice information;
performing a logarithmic calculation on the spectrum information and the frequency response value to obtain a logarithmic energy;
performing a discrete cosine calculation on the logarithmic energy to obtain the first voiceprint feature.
Optionally, the calculation function of the discrete Fourier transform is:
where N is the total number of frames of the digitally filtered voice signal, n is the frame index of the digitally filtered voice signal, w(n) is the single-frame data of the standard voice information, j is the weight of the Fourier transform, k is the sound frequency of a single frame in the digitally filtered voice signal, and D is the spectrum information.
Optionally, determining whether a voiceprint feature matching the first voiceprint feature exists in the preset black voiceprint library comprises:
calculating, with a similarity function, a first similarity between the first voiceprint feature and each of a plurality of voiceprint features in the preset black voiceprint library;
if any first similarity is greater than a first similarity threshold, determining that a voiceprint feature matching the first voiceprint feature exists in the preset black voiceprint library.
Optionally, the similarity function is:
where x denotes the first voiceprint feature, y_i denotes a voiceprint feature in the preset black voiceprint library, n denotes the number of voiceprint features in the preset black voiceprint library, and sim(x, y_i) denotes the first similarity.
To solve the above problems, the present invention also provides a risk user identification device based on voiceprint features and associated graph data, the device comprising:
a voice information acquisition module, configured to acquire standard voice information of a user;
a voiceprint feature extraction module, configured to extract a first voiceprint feature from the standard voice information;
a graph data acquisition module, configured to input the first voiceprint feature into a preset association graph model to obtain associated graph data related to the first voiceprint feature;
a vector conversion module, configured to vectorize the associated graph data to obtain an associated feature vector;
a judgment module, configured to determine whether a voiceprint feature matching the first voiceprint feature exists in a preset black voiceprint library;
the judgment module being further configured to determine whether a label feature vector matching the associated feature vector exists in a preset black relationship graph;
a determination module, configured to determine that the user is a risk user if a voiceprint feature matching the first voiceprint feature exists in the preset black voiceprint library, or a label feature vector matching the associated feature vector exists in the preset black relationship graph.
To solve the above problems, the present invention also provides an electronic device, the electronic device comprising:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement any one of the risk user identification methods based on voiceprint features and associated graph data described above.
To solve the above problems, the present invention also provides a computer-readable storage medium storing at least one instruction, the at least one instruction being executed by a processor in an electronic device to implement any one of the risk user identification methods based on voiceprint features and associated graph data described above.
In the embodiments of the present invention, standard voice information of a user is acquired; a first voiceprint feature is extracted from the standard voice information; the first voiceprint feature is input into a preset association graph model to obtain associated graph data related to the first voiceprint feature; the associated graph data is vectorized to obtain an associated feature vector; it is determined whether a voiceprint feature matching the first voiceprint feature exists in a preset black voiceprint library, and whether a label feature vector matching the associated feature vector exists in a preset black relationship graph; and if a voiceprint feature matching the first voiceprint feature exists in the preset black voiceprint library, or a label feature vector matching the associated feature vector exists in the preset black relationship graph, the user is determined to be a risk user. Verifying through these two channels reduces the missed-detection rate when identifying risk users and thereby improves information security.
Description of Drawings
FIG. 1 is a schematic flowchart of a risk user identification method based on voiceprint features and associated graph data according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a risk user identification device based on voiceprint features and associated graph data according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the internal structure of an electronic device that implements the risk user identification method based on voiceprint features and associated graph data according to an embodiment of the present invention.
The realization, functional characteristics, and advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described here are only intended to explain the present invention and not to limit it.
The present invention provides a risk user identification method based on voiceprint features and associated graph data. FIG. 1 is a schematic flowchart of the method according to an embodiment of the present invention. The method may be performed by a device, and the device may be implemented in software and/or hardware.
In this embodiment, the risk user identification method based on voiceprint features and associated graph data comprises:
S1. Acquire standard voice information of a user.
In this embodiment, the standard voice information of the user may be obtained from a voice database.
Further, acquiring the standard voice information of the user comprises:
acquiring original voice information of the user;
sampling the original voice information with an analog-to-digital converter to obtain a digital voice signal;
performing a pre-emphasis operation on the digital voice signal to obtain a digitally filtered voice signal;
performing a framing and windowing operation on the digitally filtered voice signal to obtain the standard voice information.
In this embodiment, the original voice information of the user is audio information that contains the user's voice. It may be voice information obtained during a voice call with the user.
For example, when a bank loan officer conducts a telephone credit review of a loan applicant, a recording of the call between the loan officer and the applicant is obtained, and that recording is the original voice information.
In detail, the original voice information is sampled in order to convert it into a digital signal, which makes the voice information easier to process.
In this embodiment, an analog-to-digital converter samples the original voice information at a rate of tens of thousands of times per second. Each sample records the state of the original voice information at one moment, so that digital voice signals at different moments are obtained.
Because the human vocal system suppresses the high-frequency part of speech, the pre-emphasis operation in this embodiment boosts the energy of the high-frequency part so that its amplitude is similar to that of the low-frequency part. This flattens the spectrum of the signal and keeps the signal-to-noise ratio the same across the whole band from low to high frequencies.
In this embodiment, the pre-emphasis operation compensates the digital voice signal.
Specifically, the pre-emphasis operation can be calculated as y(t) = x(t) - μx(t-1), where x(t) is the digital voice signal, t is time, y(t) is the digitally filtered voice signal, and μ is the adjustment value of the pre-emphasis operation, with a value range of [0.9, 1.0].
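The pre-emphasis formula y(t) = x(t) - μx(t-1) can be sketched in a few lines. The default μ = 0.97 is an assumed value inside the stated range [0.9, 1.0], and passing the first sample through unchanged (treating x(-1) as 0) is also an assumption, since the text does not say how the boundary is handled.

```python
def pre_emphasis(x, mu=0.97):
    """Apply y(t) = x(t) - mu * x(t-1) to a list of samples.

    The first sample has no predecessor and is returned unchanged
    (an assumed boundary convention).
    """
    if not x:
        return []
    return [x[0]] + [x[t] - mu * x[t - 1] for t in range(1, len(x))]


# A constant (purely low-frequency) signal is strongly attenuated after the first sample.
print(pre_emphasis([1.0, 1.0, 1.0], mu=0.5))
```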
In this embodiment, the framing and windowing operation is used to remove the overlapping parts of speech in the digitally filtered voice signal.
For example, when a bank loan officer conducts a telephone credit review of a loan applicant, the original voice information contains passages where the voices of the loan officer and the applicant overlap. The framing and windowing operation is used to remove the loan officer's voice and keep the applicant's voice.
Further, performing the framing and windowing operation on the digitally filtered voice signal comprises:
performing the framing and windowing operation on the digitally filtered voice signal by means of an objective function, the objective function being:
where n is the frame index of the digitally filtered voice signal, N is the total number of frames of the digitally filtered voice signal, and w(n) is the single-frame data of the standard voice information, that is, w(n) represents the standard voice information of each frame.
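A framing and windowing step can be sketched as follows. The window used here is a Hamming window, w(n) = 0.54 - 0.46·cos(2πn/(N-1)), which is a common choice but an assumption: the objective function itself is not given in the text. In this sketch n indexes samples within one frame and N is the frame length. `frame_len` and `hop` are hypothetical parameters, and the code only shows splitting the signal into frames and tapering each one.

```python
import math


def hamming(N):
    """Hamming window of length N (assumed window shape)."""
    if N == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def frame_and_window(signal, frame_len, hop):
    """Split signal into overlapping frames and taper each frame with the window."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


frames = frame_and_window([1.0] * 10, frame_len=4, hop=2)
print(len(frames), len(frames[0]))
```

Trailing samples that do not fill a whole frame are dropped in this sketch. Padding the last frame would be an equally reasonable convention.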
S2. Extract a first voiceprint feature from the standard voice information.
In detail, extracting the first voiceprint feature from the standard voice information comprises:
performing a discrete Fourier transform on the standard voice information to obtain spectrum information of the standard voice information;
performing a triangular filtering calculation on the standard voice information with a triangular filter to obtain a frequency response value of the standard voice information;
performing a logarithmic calculation on the spectrum information and the frequency response value to obtain a logarithmic energy;
performing a discrete cosine calculation on the logarithmic energy to obtain the first voiceprint feature.
Preferably, the calculation function of the discrete Fourier transform is:
where N is the total number of frames of the digitally filtered voice signal, n is the frame index of the digitally filtered voice signal, w(n) is the single-frame data of the standard voice information, that is, w(n) represents the standard voice information of each frame, j is the weight of the Fourier transform, k is the sound frequency of a single frame in the digitally filtered voice signal, and D is the spectrum information.
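The symbols defined above line up with the standard discrete Fourier transform. Assuming that standard form is what is intended, the spectrum information would be written as:

```latex
D(k) = \sum_{n=0}^{N-1} w(n)\, e^{-j\,2\pi k n / N}, \qquad k = 0, 1, \dots, N-1
```

In this standard form, j is the imaginary unit and k indexes the frequency bin.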
Preferably, in this embodiment, a filter bank of M filters is defined (the filters may be triangular filters). The center frequencies of the filters are f(i), i = 1, 2, …, M, where the center frequency is the cutoff frequency of the filter, and the triangular filtering calculation is performed with the triangular filters.
The triangular filters smooth the spectrum and remove the effect of harmonics, which highlights the formants of the sound. The tone or pitch of a sound is therefore not reflected in the voiceprint feature; that is, differences in the pitch of the input voice do not affect the recognition result.
Preferably, the triangular filtering is calculated as follows:
where f(i) is the center frequency of the triangular filter, i is the index of the triangular filter, H(k) is the frequency response value, and k is the sound frequency of a single frame in the digitally filtered voice signal, that is, k can represent the sound frequency of each frame.
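A triangular frequency response of the kind described can be sketched as below, assuming the conventional form used in mel filter banks: the response rises linearly from 0 at f(i-1) to 1 at f(i), then falls linearly back to 0 at f(i+1). The function names and the use of integer bin indices for the center frequencies are illustrative assumptions.

```python
def triangular_response(k, f_prev, f_center, f_next):
    """Response of one triangular filter at frequency bin k.

    Assumes f_prev < f_center < f_next (strictly increasing).
    """
    if k < f_prev or k > f_next:
        return 0.0
    if k <= f_center:
        return (k - f_prev) / (f_center - f_prev)
    return (f_next - k) / (f_next - f_center)


def filter_bank(centers, num_bins):
    """Build M filters from M + 2 boundary points f(0), f(1), ..., f(M+1)."""
    return [
        [triangular_response(k, centers[i - 1], centers[i], centers[i + 1])
         for k in range(num_bins)]
        for i in range(1, len(centers) - 1)
    ]


bank = filter_bank([0, 2, 4, 6], num_bins=7)
print(bank[0])
```

Each filter peaks at its own center frequency and reaches zero at the centers of its neighbors, so adjacent filters overlap by half.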
Further, the logarithmic transformation calculates the logarithmic energy output by each filter in the filter bank.
Human response to sound pressure is roughly logarithmic: people are less sensitive to small changes at high sound pressure than at low sound pressure. Using the logarithm in this embodiment therefore reduces the sensitivity of the extracted features to changes in the energy of the input sound.
Specifically, the logarithmic calculation can be performed with the following formula:
where i is the index of the triangular filter, k is the sound frequency of a single frame of the original voice information, N is the total number of frames of the digitally filtered voice signal, n is the frame index of the digitally filtered voice signal, D is the spectrum information, and S(i) is the logarithmic energy output by each filter.
Preferably, the voiceprint feature is obtained by applying a discrete cosine transform to S(i), the discrete cosine transform being as follows:
where n is the frame index of the original voice information, i is the index of the triangular filter, M is the total number of triangular filters, S(i) is the logarithmic energy output by each filter, and x is the voiceprint feature.
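The logarithm and discrete cosine steps can be sketched together. The sketch assumes the conventional MFCC forms S(i) = ln(Σ_k |D(k)|²·H_i(k)) and x(n) = Σ_{i=1..M} S(i)·cos(πn(i - 0.5)/M), which the text does not state explicitly. The small constant `eps` that guards against log(0) and the parameter `num_coeffs` are added for illustration.

```python
import math


def log_energies(power_spectrum, bank, eps=1e-10):
    """S(i): log of the filter-weighted spectral power for each filter row in bank."""
    return [
        math.log(sum(p * h for p, h in zip(power_spectrum, row)) + eps)
        for row in bank
    ]


def dct_coefficients(S, num_coeffs):
    """x(n) for n = 1..num_coeffs from the M log energies S(1)..S(M)."""
    M = len(S)
    return [
        sum(S[i - 1] * math.cos(math.pi * n * (i - 0.5) / M) for i in range(1, M + 1))
        for n in range(1, num_coeffs + 1)
    ]


# Hypothetical power spectrum |D(k)|^2 and a two-filter bank over four bins.
power = [4.0, 1.0, 0.25, 1.0]
bank = [[1.0, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 0.5]]
S = log_energies(power, bank)
print(dct_coefficients(S, num_coeffs=2))
```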
Further, in another embodiment of the present invention, extracting the first voiceprint feature from the standard voice information comprises:
extracting the first voiceprint feature from the standard voice information with an LSTM (Long Short-Term Memory) network. The LSTM has three "gate" structures, namely a forget gate, an input gate, and an output gate, which process the input information in different ways. The forget gate, as its name implies, causes part of the information passing through it to be forgotten by the neural unit, so that part of the speech features of the previous frame disappears during propagation and no longer enters the next neural unit for training. The input gate adds new, useful information to the state of the neural unit; that is, the speech features newly learned from the current frame are processed and added to the propagated information. Finally, the output gate produces its output from the neural unit state and the processed information: based on the output at the previous moment and the information to be output from the input at the current moment, it yields the output information at the current moment, which is taken as the first voiceprint feature.
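The interaction of the three gates can be illustrated with one time step of a single-unit LSTM in its conventional formulation. The scalar weights and the `params` layout are hypothetical, and a real network would use weight matrices and trained parameters.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM.

    params maps a gate name to (w_x, w_h, b), all scalars (hypothetical layout).
    Returns the new hidden output h and cell state c.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much new information to admit
    g = gate("candidate", math.tanh)  # the new information itself
    o = gate("output", sigmoid)       # how much of the cell state to expose

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


zero = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero)
print(h, c)
```

With all weights at zero, every sigmoid gate evaluates to 0.5 and the candidate to 0, so the step simply halves the previous cell state.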
S3. Input the first voiceprint feature into a preset association graph model to obtain associated graph data related to the first voiceprint feature.
In this embodiment, the associated graph data related to the first voiceprint feature may include, but is not limited to, the user label data corresponding to the first voiceprint feature and the call records corresponding to the first voiceprint feature. Specifically, the user label data includes attribute data of the user, such as gender, age, region, and occupation.
In detail, in this embodiment the association graph model may be built with a convolutional neural network. Sample voiceprint features are used as the training set, and sample voiceprint features annotated with user label data are used as the label set, to train the association graph model.
For example, the first voiceprint feature of a user is input into the preset association graph model to obtain the associated graph data related to that feature, such as information about the corresponding user (name, gender, age, region, occupation, and so on) or the historical call times and call counts corresponding to the first voiceprint feature.
S4. Vectorize the associated graph data to obtain an associated feature vector.
In detail, the vectorization is performed with the following expression:
where i denotes the index of the associated graph data, v_i denotes the N-dimensional matrix vector of associated graph data i, and v_j is the j-th element of the N-dimensional matrix vector.
S5. Determine whether the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature, and determine whether the preset black relationship graph contains a tag feature vector matching the associated feature vector.
In detail, determining whether the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature includes: computing, with a similarity function, a first similarity between the first voiceprint feature and each of a plurality of voiceprint features in the preset black voiceprint library; and, if any first similarity is greater than a first similarity threshold, determining that the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature.
Alternatively, determining whether the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature includes: computing the similarity between the first voiceprint feature and the voiceprint features in the preset black voiceprint library to obtain a first similarity set; taking the maximum value in the first similarity set as a first target similarity; and, if the first target similarity is greater than the first similarity threshold, determining that the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature.
In this embodiment, the black voiceprint library (also referred to as the blacklist voiceprint library) is a voiceprint database obtained by extracting voiceprint feature vectors from the voices of blacklisted persons.
For example, the blacklist voiceprint library contains the voiceprint features of persons a bank has recorded as untrustworthy and/or a public security department's library of criminals' voiceprint features.
Further, the similarity function is:
where x denotes the first voiceprint feature, y_i denotes a voiceprint feature in the preset black voiceprint library, n denotes the number of voiceprint features in the preset black voiceprint library, and sim(x, y_i) denotes the first similarity.
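The two matching variants above (any similarity above the threshold, or the maximum similarity above the threshold) give the same result and can be sketched as follows. The similarity function itself is not reproduced in this text, so cosine similarity is used purely as an assumption; `cosine_similarity`, `has_match`, and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine similarity of two equal-length vectors (assumed similarity function)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_x * norm_y)


def has_match(query, library, threshold):
    """True if the largest similarity against the library exceeds the threshold."""
    if not library:
        return False
    best = max(cosine_similarity(query, item) for item in library)
    return best > threshold


# Toy black voiceprint library with two stored feature vectors.
black_voiceprints = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]
match = has_match([0.6, 0.8, 0.1], black_voiceprints, threshold=0.8)
miss = has_match([0.0, 0.0, 1.0], black_voiceprints, threshold=0.8)
```

The same helper could be reused for the black relationship graph by passing associated feature vectors and tag feature vectors with the second threshold.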
Similarly, determining whether the preset black relationship graph contains a tag feature vector matching the associated feature vector includes: computing, with the similarity function, a second similarity between the associated feature vector and each of a plurality of tag feature vectors in the preset black relationship graph; and, if any second similarity is greater than a second similarity threshold, determining that the preset black relationship graph contains a tag feature vector matching the associated feature vector.
Alternatively, determining whether the preset black relationship graph contains a tag feature vector matching the associated feature vector includes: computing the similarity between the associated feature vector and the tag feature vectors in the preset black relationship graph to obtain a second similarity set; taking the maximum value in the second similarity set as a second target similarity; and, if the second target similarity is greater than the second similarity threshold, determining that the preset black relationship graph contains a tag feature vector matching the associated feature vector.
In this embodiment, the black relationship graph database is obtained by extracting tag feature vectors from the tag data of blacklisted persons; it therefore contains the tag feature vectors of the blacklisted persons' tag data.
In this embodiment, the second similarity threshold may be the same as or different from the first similarity threshold; it may be greater than or less than the first. For example, the first similarity threshold may be 80% and the second similarity threshold 90%.
S6. If the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature, or the preset black relationship graph contains a tag feature vector matching the associated feature vector, determine that the user is a risk user.
Identifying the user as a risk user when either the preset black voiceprint library contains a matching voiceprint feature or the preset black relationship graph contains a matching tag feature vector allows risk users to be identified more comprehensively and reduces the missed detections that arise from relying on a single check.
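The either-or decision in S6 can be sketched as below. This is a minimal sketch assuming the similarities have already been computed; the function name `is_risk_user` is hypothetical, and the default thresholds simply reuse the 80% and 90% example values mentioned for this embodiment.

```python
def is_risk_user(voice_sims, relation_sims,
                 voice_threshold=0.8, relation_threshold=0.9):
    """Flag a user when either check finds a match.

    voice_sims: similarities against the black voiceprint library.
    relation_sims: similarities against the black relationship graph.
    """
    voice_hit = any(s > voice_threshold for s in voice_sims)
    relation_hit = any(s > relation_threshold for s in relation_sims)
    # Logical OR: one positive check is enough to flag the user.
    return voice_hit or relation_hit


flagged_by_voice = is_risk_user([0.85, 0.10], [0.20])
flagged_by_relation = is_risk_user([0.30], [0.95])
not_flagged = is_risk_user([0.30], [0.50])
```

Using OR rather than AND is what lowers the missed-detection rate: a user who evades one check can still be caught by the other, at the cost of more users being flagged overall.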
Further, if the user is determined to be a risk user, a risk user alert message is sent.
In this embodiment of the present invention, standard voice information of a user is obtained; a first voiceprint feature of the standard voice information is extracted; the first voiceprint feature is input into a preset association graph model to obtain associated graph data related to the first voiceprint feature; the associated graph data is vectorized to obtain an associated feature vector; it is determined whether the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature, and whether the preset black relationship graph contains a tag feature vector matching the associated feature vector; and if the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature, or the preset black relationship graph contains a tag feature vector matching the associated feature vector, the user is determined to be a risk user. Verifying through these two channels reduces the missed-detection rate for risk users and thereby improves information security.
FIG. 2 is a functional block diagram of the risk user identification apparatus based on voiceprint features and associated graph data according to the present invention.
The risk user identification apparatus 100 based on voiceprint features and associated graph data according to the present invention may be installed in an electronic device. Depending on the functions implemented, the apparatus may include a voice information acquisition module 101, a voiceprint feature extraction module 102, a graph data acquisition module 103, a vector conversion module 104, a judgment module 105, and a determination module 106. A module in the present invention, which may also be called a unit, refers to a series of computer program segments that can be executed by the processor of the electronic device, perform a fixed function, and are stored in the memory of the electronic device.
In this embodiment, the functions of the modules/units are as follows:
The voice information acquisition module 101 is configured to obtain standard voice information of a user.
In this embodiment, the standard voice information of the user may be obtained from a voice database.
Further, obtaining the standard voice information of the user includes:
obtaining original voice information of the user;
sampling the original voice information with an analog-to-digital converter to obtain a digital voice signal;
performing a pre-emphasis operation on the digital voice signal to obtain a digitally filtered voice signal;
performing a framing and windowing operation on the digitally filtered voice signal to obtain the standard voice information.
In this embodiment, the original voice information of the user is audio information containing the user's voice; it may be voice information captured during a voice call with the user.
In detail, the original voice information is sampled to convert it into a digital signal that is easier to process.
In this embodiment, an analog-to-digital converter samples the original voice information at a rate of tens of thousands of samples per second. Each sample records the state of the original voice information at a particular instant, yielding a digital voice signal across successive instants.
Because the human vocal system attenuates high frequencies, the pre-emphasis operation in this embodiment boosts the energy of the high-frequency part so that its amplitude is similar to that of the low-frequency part. This flattens the signal spectrum, so that the same signal-to-noise ratio can be used across the entire band from low to high frequencies.
In this embodiment, the pre-emphasis operation compensates the digital voice signal.
Specifically, the pre-emphasis operation may be computed as y(t) = x(t) - μx(t-1), where x(t) is the digital voice signal, t is time, y(t) is the digitally filtered voice signal, and μ is the adjustment coefficient of the pre-emphasis operation, with a value in the range [0.9, 1.0].
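The pre-emphasis formula y(t) = x(t) - μx(t-1) translates directly into code. The sketch below is illustrative: the function name `pre_emphasis`, the default μ = 0.97, and the choice to pass the first sample through unchanged (it has no predecessor) are assumptions, not details taken from this embodiment.

```python
def pre_emphasis(signal, mu=0.97):
    """Apply y(t) = x(t) - mu * x(t-1) to a list of samples."""
    if not 0.9 <= mu <= 1.0:
        raise ValueError("mu is expected to lie in [0.9, 1.0]")
    if not signal:
        return []
    out = [signal[0]]  # assumption: first sample kept as-is
    for t in range(1, len(signal)):
        out.append(signal[t] - mu * signal[t - 1])
    return out


# A step input: the constant (low-frequency) part is strongly attenuated,
# while the sudden change (high-frequency content) passes through.
emphasized = pre_emphasis([0.0, 1.0, 1.0, 1.0])
```

With μ close to 1 the filter nearly cancels slowly varying content, which is how it raises the relative weight of the high-frequency part.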
In this embodiment, the framing and windowing operation is used to remove overlapping portions of speech in the digitally filtered voice signal.
Further, performing the framing and windowing operation on the digitally filtered voice signal includes:
performing the framing and windowing operation on the digitally filtered voice signal using an objective function, the objective function being:
where n is the frame index of the digitally filtered voice signal, N is the total number of frames of the digitally filtered voice signal, and w(n) is the single-frame data of the standard voice information, that is, w(n) denotes the standard voice information of each frame.
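Framing and windowing can be sketched as follows. The objective function is not reproduced in this text, so the sketch assumes a Hamming window, 0.54 - 0.46 cos(2πn / (L - 1)), which is a common choice for speech; here n is the sample index within a frame and L the frame length, which may differ from the symbol definitions above. The names `hamming`, `frame_and_window`, `frame_len`, and `hop` are hypothetical.

```python
import math


def hamming(length):
    """Hamming window coefficients for a frame of the given length."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_and_window(signal, frame_len, hop):
    """Split a signal into frames of frame_len samples, hop samples apart,
    and multiply each frame by a Hamming window."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# Ten samples, frames of four samples, advancing two samples at a time.
frames = frame_and_window([1.0] * 10, frame_len=4, hop=2)
```

The window tapers each frame toward zero at its edges, which limits the spectral leakage that abrupt frame boundaries would otherwise cause in the later Fourier transform.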
The voiceprint feature extraction module 102 is configured to extract the first voiceprint feature of the standard voice information.
In detail, extracting the first voiceprint feature of the standard voice information includes:
performing a discrete Fourier transform on the standard voice information to obtain spectrum information of the standard voice information;
performing triangular filtering on the standard voice information with triangular filters to obtain frequency response values of the standard voice information;
performing a logarithmic calculation on the spectrum information and the frequency response values to obtain logarithmic energies;
performing a discrete cosine calculation on the logarithmic energies to obtain the first voiceprint feature.
Preferably, the discrete Fourier transform is computed by the following function:
where N is the total number of frames of the digitally filtered voice signal, n is the frame index of the digitally filtered voice signal, w(n) is the single-frame data of the standard voice information, that is, w(n) denotes the standard voice information of each frame, j is the weight of the Fourier transform, k is the sound frequency of a single frame in the digitally filtered voice signal, and D is the spectrum information.
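As a rough illustration of the transform step, the sketch below computes a discrete Fourier transform of one frame from the textbook definition, D(k) = Σ w(n)·exp(-j2πkn/N). This reading treats n as the sample index within a frame, N as the frame length, and j as the imaginary unit, which is an assumption and differs from the wording of the symbol definitions above; the function names are hypothetical.

```python
import cmath


def dft(frame):
    """Direct O(N^2) discrete Fourier transform of one frame."""
    n_samples = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


def power_spectrum(frame):
    """Squared magnitude of each DFT bin."""
    return [abs(c) ** 2 for c in dft(frame)]


# One period of a cosine sampled four times: energy appears in bins 1 and 3.
spectrum = power_spectrum([1.0, 0.0, -1.0, 0.0])
```

A production system would use a fast Fourier transform instead of this direct sum; the direct form is shown only because it mirrors the definition term by term.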
Preferably, in this embodiment, a filter bank of M filters (which may be triangular filters) is defined, with center frequencies f(i), i = 1, 2, ..., M, where the center frequency is the cutoff frequency of the filter; the triangular filtering is computed with these triangular filters.
The triangular filters smooth the spectrum and suppress the effect of harmonics, which highlights the formants of the sound. As a result, the tone or pitch of a stretch of speech is not reflected in the voiceprint feature; in other words, differences in the pitch of the input speech do not affect the recognition result.
Preferably, the triangular filtering is computed as follows:
where f(i) is the center frequency of the triangular filter, i is the index of the triangular filter, H(k) is the frequency response value, and k is the sound frequency of a single frame in the digitally filtered voice signal, that is, k may denote the sound frequency of each frame.
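A triangular filter response of the kind described above can be sketched as a piecewise-linear function that rises from a left edge to 1 at the center frequency and falls back to 0 at a right edge. The exact formula is not reproduced in this text, so this standard shape is an assumption, and the names `triangular_response`, `left`, `center`, and `right` are hypothetical (in a filter bank, the edges of filter i would typically be f(i-1) and f(i+1)).

```python
def triangular_response(k, left, center, right):
    """Response H(k) of one triangular filter at frequency bin k."""
    if not left < center < right:
        raise ValueError("expected left < center < right")
    if k <= left or k >= right:
        return 0.0                              # outside the triangle
    if k <= center:
        return (k - left) / (center - left)     # rising edge
    return (right - k) / (right - center)       # falling edge


# A toy bank of three overlapping filters evaluated on bins 0..8.
edges = [(0, 2, 4), (2, 4, 6), (4, 6, 8)]
bank = [[triangular_response(k, l, c, r) for k in range(9)]
        for (l, c, r) in edges]
```

Because neighbouring triangles overlap, each filter averages several adjacent bins, which is the smoothing effect that suppresses individual harmonics.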
Further, the logarithmic transformation computes the logarithmic energy output by each filter in the filter bank.
Human perception of sound pressure is generally logarithmic: people are less sensitive to small changes at high sound pressure than at low sound pressure. Taking the logarithm in this embodiment therefore reduces the sensitivity of the extracted features to changes in input sound energy.
Specifically, the logarithmic calculation may be performed using the following formula:
where i is the index of the triangular filter, k is the sound frequency of a single frame of the original voice information, N is the total number of frames of the digitally filtered voice signal, n is the frame index of the digitally filtered voice signal, D is the spectrum information, and S(i) is the logarithmic energy output by each filter.
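The logarithmic step can be sketched as weighting the power spectrum by each filter's response, summing, and taking the natural logarithm, S(i) = ln(Σ |D(k)|²·H_i(k)). This standard form is an assumption, since the formula is not reproduced in this text; the function name and the small `floor` that guards against log(0) are hypothetical additions.

```python
import math


def log_filterbank_energies(power_spectrum, filters, floor=1e-10):
    """Log energy at the output of each filter.

    power_spectrum: |D(k)|^2 per frequency bin.
    filters: one list of responses H_i(k) per filter, same length as the spectrum.
    floor: lower bound applied before the logarithm (assumption, avoids log(0)).
    """
    energies = []
    for response in filters:
        energy = sum(p * h for p, h in zip(power_spectrum, response))
        energies.append(math.log(max(energy, floor)))
    return energies


power = [1.0, 2.0, 3.0, 4.0]
filters = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
energies = log_filterbank_energies(power, filters)
```

Because of the logarithm, doubling the input energy adds a constant to each S(i) instead of doubling it, which is the reduced sensitivity to loudness described above.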
Preferably, S(i) is passed through a discrete cosine transform to obtain the voiceprint feature; the discrete cosine transform is as follows:
where n is the frame index of the original voice information, i is the index of the triangular filter, M is the total number of triangular filters, S(i) is the logarithmic energy output by each filter, and x is the voiceprint feature.
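The final step can be sketched with the cosine transform commonly used for cepstral coefficients, x(n) = Σ_{i=1..M} S(i)·cos(πn(i - 0.5)/M). This specific form is an assumption, since the formula is not reproduced in this text, and here n is read as the coefficient index; the function name `dct_coefficients` and the parameter `num_coeffs` are hypothetical.

```python
import math


def dct_coefficients(log_energies, num_coeffs):
    """Cosine transform of the M log filter-bank energies.

    Returns coefficients x(1) .. x(num_coeffs).
    """
    m = len(log_energies)
    return [
        sum(s * math.cos(math.pi * n * (i - 0.5) / m)
            for i, s in enumerate(log_energies, start=1))
        for n in range(1, num_coeffs + 1)
    ]


# A flat set of log energies carries no spectral shape,
# so its coefficients come out (numerically) zero.
flat = dct_coefficients([1.0, 1.0, 1.0, 1.0], num_coeffs=2)
single = dct_coefficients([1.0, 0.0, 0.0, 0.0], num_coeffs=1)
```

The transform decorrelates the filter outputs, so a handful of low-order coefficients usually suffices as a compact feature vector.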
Further, in another embodiment of the present invention, extracting the first voiceprint feature of the standard voice information includes:
extracting the first voiceprint feature of the standard voice information using an LSTM (Long Short-Term Memory) network. The LSTM has three "gate" structures, namely a forget gate, an input gate, and an output gate, each of which processes the input information differently. The forget gate, as its name suggests, causes part of the information passing through it to be forgotten by the neural unit, so that part of the speech features of the previous frame is dropped during propagation and no longer enters the next neural unit for training. The input gate adds new, useful information to the neural unit state; that is, the speech features newly learned from the current frame are processed and then added to the propagated information. Finally, the output gate produces the output based on the neural unit state and the processed information: from the output at the previous time step and the information to be output from the input at the current time step, it obtains the output at the current time step, which serves as the first voiceprint feature.
The graph data acquisition module 103 is configured to input the first voiceprint feature into a preset association graph model to obtain associated graph data related to the first voiceprint feature.
In this embodiment, the associated graph data related to the first voiceprint feature may include, but is not limited to, user tag data corresponding to the first voiceprint feature and call records corresponding to the first voiceprint feature. Specifically, the user tag data includes attribute data of the user, such as gender, age, region, and occupation data.
In detail, in this embodiment, the association graph model may be built with a convolutional neural network and trained using sample voiceprint features as the training set and sample voiceprint features annotated with user tag data as the label set.
The vector conversion module 104 is configured to vectorize the associated graph data to obtain an associated feature vector.
In detail, the vectorization is performed using the following expression:
where i denotes the index of the associated graph data, v_i denotes the N-dimensional matrix vector of associated graph data i, and v_j is the j-th element of that N-dimensional matrix vector.
The judgment module 105 is configured to determine whether the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature; the judgment module is further configured to determine whether the preset black relationship graph contains a tag feature vector matching the associated feature vector.
In detail, determining whether the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature includes: computing, with a similarity function, a first similarity between the first voiceprint feature and each of a plurality of voiceprint features in the preset black voiceprint library; and, if any first similarity is greater than a first similarity threshold, determining that the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature.
Alternatively, determining whether the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature includes: computing the similarity between the first voiceprint feature and the voiceprint features in the preset black voiceprint library to obtain a first similarity set; taking the maximum value in the first similarity set as a first target similarity; and, if the first target similarity is greater than the first similarity threshold, determining that the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature.
In this embodiment, the black voiceprint library (also referred to as the blacklist voiceprint library) is a voiceprint database obtained by extracting voiceprint feature vectors from the voices of blacklisted persons.
For example, the blacklist voiceprint library contains the voiceprint features of persons a bank has recorded as untrustworthy and/or a public security department's library of criminals' voiceprint features.
Further, the similarity function is:
where x denotes the first voiceprint feature, y_i denotes a voiceprint feature in the preset black voiceprint library, n denotes the number of voiceprint features in the preset black voiceprint library, and sim(x, y_i) denotes the first similarity.
Similarly, determining whether the preset black relationship graph contains a tag feature vector matching the associated feature vector includes: computing, with the similarity function, a second similarity between the associated feature vector and each of a plurality of tag feature vectors in the preset black relationship graph; and, if any second similarity is greater than a second similarity threshold, determining that the preset black relationship graph contains a tag feature vector matching the associated feature vector.
Alternatively, determining whether the preset black relationship graph contains a tag feature vector matching the associated feature vector includes: computing the similarity between the associated feature vector and the tag feature vectors in the preset black relationship graph to obtain a second similarity set; taking the maximum value in the second similarity set as a second target similarity; and, if the second target similarity is greater than the second similarity threshold, determining that the preset black relationship graph contains a tag feature vector matching the associated feature vector.
In this embodiment, the black relationship graph database is obtained by extracting tag feature vectors from the tag data of blacklisted persons; it therefore contains the tag feature vectors of the blacklisted persons' tag data.
The determination module 106 is configured to determine that the user is a risk user if the preset black voiceprint library contains a voiceprint feature matching the first voiceprint feature, or the preset black relationship graph contains a tag feature vector matching the associated feature vector.
Identifying the user as a risk user when either the preset black voiceprint library contains a matching voiceprint feature or the preset black relationship graph contains a matching tag feature vector allows risk users to be identified more comprehensively and accurately.
Further, if the user is determined to be a risk user, a risk user alert message is sent.
FIG. 3 is a schematic structural diagram of an electronic device that implements the risk user identification method based on voiceprint features and associated graph data according to the present invention.
The electronic device 1 may include a processor 10, a memory 11, and a bus, and may further include a computer program stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, removable hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disc, and the like. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. In other embodiments, the memory 11 may be an external storage device of the electronic device 1, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed on the electronic device 1 and various kinds of data, such as the code of the risk user identification program based on voiceprint features and associated graph data, but also to temporarily store data that has been output or is to be output.
In some embodiments, the processor 10 may consist of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit of the electronic device. It connects the components of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device 1 and processes data by running or executing programs or modules stored in the memory 11 (for example, the risk user identification program based on voiceprint features and associated graph data) and by calling data stored in the memory 11.
The bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to provide connection and communication between the memory 11, the at least one processor 10, and other components.
FIG. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) that powers the components. Preferably, the power supply is logically connected to the at least one processor 10 through a power management device, so that charging management, discharging management, and power consumption management are implemented through the power management device. The power supply may also include one or more DC or AC power sources, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and any other such components. The electronic device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described further here.
Further, the electronic device 1 may include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further include a user interface, which may be a display or an input unit (such as a keyboard); optionally, the user interface may also be a standard wired or wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be called a display screen or display unit, is used to display information processed in the electronic device 1 and to display a visual user interface.
It should be understood that the embodiments are for illustration only and that the scope of the patent application is not limited by this structure.
The risk user identification program 12 based on voiceprint features and associated graph data, stored in the memory 11 of the electronic device 1, is a combination of multiple instructions. When run on the processor 10, it can implement the following:
Obtaining the user's standard voice information;
Extracting a first voiceprint feature from the standard voice information;
Inputting the first voiceprint feature into a preset associated graph model to obtain associated graph data related to the first voiceprint feature;
Vectorizing the associated graph data to obtain an associated feature vector;
Determining whether a voiceprint feature matching the first voiceprint feature exists in a preset black voiceprint library; and
Determining whether a label feature vector matching the associated feature vector exists in a preset black relationship graph;
If a voiceprint feature matching the first voiceprint feature exists in the preset black voiceprint library, or a label feature vector matching the associated feature vector exists in the preset black relationship graph, determining that the user is a risk user.
Specifically, for how the processor 10 implements the above instructions, refer to the description of the relevant steps in the embodiment corresponding to FIG. 1; details are not repeated here.
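The final decision in the instruction sequence above (a match in either the black voiceprint library or the black relationship graph marks the user as a risk user) can be illustrated with a minimal Python sketch. It is not the implementation described in the patent: this section does not say how "matching" is computed, so cosine similarity against a fixed threshold is an assumption made here purely for illustration. The names `cosine_similarity`, `has_match`, `is_risk_user`, and the default threshold of 0.9 are likewise hypothetical. Voiceprint extraction and the associated graph model are out of scope; the sketch takes their outputs as plain vectors.

```python
from math import sqrt
from typing import Iterable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def has_match(query: Vector, library: Iterable[Vector], threshold: float) -> bool:
    """True if any library entry is at least `threshold`-similar to the query.

    ASSUMPTION: "matching" is modeled as cosine similarity >= threshold.
    """
    return any(cosine_similarity(query, entry) >= threshold for entry in library)


def is_risk_user(
    first_voiceprint: Vector,
    associated_vector: Vector,
    black_voiceprints: Iterable[Vector],
    black_label_vectors: Iterable[Vector],
    threshold: float = 0.9,  # hypothetical value, not taken from the source
) -> bool:
    """Apply the OR rule: a hit in either blacklist marks the user as a risk user."""
    voiceprint_hit = has_match(first_voiceprint, black_voiceprints, threshold)
    graph_hit = has_match(associated_vector, black_label_vectors, threshold)
    return voiceprint_hit or graph_hit
```

A caller would pass the first voiceprint feature and the associated feature vector for one user, together with the two blacklists. Because the two checks are combined with a logical OR, a user whose voice is not blacklisted can still be flagged through blacklisted relationships.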
Further, if the modules/units integrated in the electronic device 1 are implemented as software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, or a read-only memory (ROM).
In the several embodiments provided by the present invention, it should be understood that the disclosed device, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are only illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware combined with software functional modules.
It will be apparent to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments, and that it may be embodied in other specific forms without departing from its spirit or essential characteristics.
Therefore, the embodiments are to be regarded in all respects as illustrative and not restrictive. The scope of the invention is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced by the invention. Any reference signs in the claims shall not be construed as limiting the claims concerned.
Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Several units or apparatuses recited in the system claims may also be implemented by one unit or apparatus through software or hardware. Words such as "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may be modified or replaced with equivalents without departing from their spirit and scope.
Claims (10)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010253799.0A CN111552832A (en) | 2020-04-01 | 2020-04-01 | Risk user identification method and device based on voiceprint features and associated graph data |
| PCT/CN2020/106017 WO2021196477A1 (en) | 2020-04-01 | 2020-07-30 | Risk user identification method and apparatus based on voiceprint characteristics and associated graph data |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010253799.0A CN111552832A (en) | 2020-04-01 | 2020-04-01 | Risk user identification method and device based on voiceprint features and associated graph data |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN111552832A (en) | 2020-08-18 |
Family
ID=72004275
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010253799.0A Pending CN111552832A (en) | 2020-04-01 | 2020-04-01 | Risk user identification method and device based on voiceprint features and associated graph data |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN111552832A (en) |
| WO (1) | WO2021196477A1 (en) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113393318A (en) * | 2021-06-10 | 2021-09-14 | 中国工商银行股份有限公司 | Bank card application risk control method and device, electronic equipment and medium |
| CN113590873A (en) * | 2021-07-23 | 2021-11-02 | 中信银行股份有限公司 | Processing method and device for white list voiceprint feature library and electronic equipment |
| CN114783444A (en) * | 2022-05-06 | 2022-07-22 | 北京明略昭辉科技有限公司 | Voiceprint recognition method, device, storage medium and electronic device |
| CN115730503A (en) * | 2021-08-26 | 2023-03-03 | 久瓴(江苏)数字智能科技有限公司 | Heating and ventilation equipment monitoring method, device and equipment based on artificial intelligence |
| CN116153319A (en) * | 2023-01-13 | 2023-05-23 | 国网江苏省电力有限公司营销服务中心 | A high-risk user detection method and system based on voiceprint recognition |
| CN116486818A (en) * | 2022-08-30 | 2023-07-25 | 重庆蚂蚁消费金融有限公司 | Speech-based identity recognition method, device and electronic equipment |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114387974B (en) * | 2021-12-13 | 2025-09-05 | 厦门快商通科技股份有限公司 | A case-linking method, system, device and storage medium based on voiceprint recognition |
| CN116013327A (en) * | 2022-12-15 | 2023-04-25 | 平安银行股份有限公司 | Anti-money laundering risk detection method, device, computer equipment and readable storage medium |
| CN116230230A (en) * | 2023-03-10 | 2023-06-06 | 深圳市品声科技有限公司 | Method and system for monitoring human health |
| CN119805188B (en) * | 2024-12-20 | 2025-08-01 | 通辽第二发电有限责任公司 | Intelligent voiceprint monitoring, diagnosing and analyzing system for high-voltage switch |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107481720A (en) * | 2017-06-30 | 2017-12-15 | 百度在线网络技术(北京)有限公司 | Explicit voiceprint recognition method and device |
| CN107993071A (en) * | 2017-11-21 | 2018-05-04 | 平安科技(深圳)有限公司 | Electronic device, voiceprint-based identity verification method, and storage medium |
| CN110047490A (en) * | 2019-03-12 | 2019-07-23 | 平安科技(深圳)有限公司 | Voiceprint recognition method, device, equipment and computer readable storage medium |
| CN110767238A (en) * | 2019-09-19 | 2020-02-07 | 平安科技(深圳)有限公司 | Blacklist identification method, apparatus, device and storage medium based on address information |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8346659B1 (en) * | 2001-07-06 | 2013-01-01 | Hossein Mohsenzadeh | Secure authentication and payment system |
| CN109428719B (en) * | 2017-08-22 | 2023-01-10 | 创新先进技术有限公司 | An identity verification method, device and equipment |
| CN110896352B (en) * | 2018-09-12 | 2022-07-08 | 阿里巴巴集团控股有限公司 | Identity recognition method, device and system |
| CN110738998A (en) * | 2019-09-11 | 2020-01-31 | 深圳壹账通智能科技有限公司 | Voice-based personal credit evaluation method, device, terminal and storage medium |
| CN110855740B (en) * | 2019-09-27 | 2021-03-19 | 深圳市火乐科技发展有限公司 | Information pushing method and related equipment |
2020
- 2020-04-01: CN application CN202010253799.0A (publication CN111552832A), status: Pending
- 2020-07-30: WO application PCT/CN2020/106017 (publication WO2021196477A1), status: Ceased
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113393318A (en) * | 2021-06-10 | 2021-09-14 | 中国工商银行股份有限公司 | Bank card application risk control method and device, electronic equipment and medium |
| CN113393318B (en) * | 2021-06-10 | 2025-01-14 | 中国工商银行股份有限公司 | Bank card application risk control method, device, electronic device and medium |
| CN113590873A (en) * | 2021-07-23 | 2021-11-02 | 中信银行股份有限公司 | Processing method and device for white list voiceprint feature library and electronic equipment |
| CN115730503A (en) * | 2021-08-26 | 2023-03-03 | 久瓴(江苏)数字智能科技有限公司 | Heating and ventilation equipment monitoring method, device and equipment based on artificial intelligence |
| CN114783444A (en) * | 2022-05-06 | 2022-07-22 | 北京明略昭辉科技有限公司 | Voiceprint recognition method, device, storage medium and electronic device |
| CN116486818A (en) * | 2022-08-30 | 2023-07-25 | 重庆蚂蚁消费金融有限公司 | Speech-based identity recognition method, device and electronic equipment |
| CN116153319A (en) * | 2023-01-13 | 2023-05-23 | 国网江苏省电力有限公司营销服务中心 | A high-risk user detection method and system based on voiceprint recognition |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2021196477A1 (en) | 2021-10-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN111552832A (en) | | Risk user identification method and device based on voiceprint features and associated graph data |
| WO2021208287A1 (en) | | Voice activity detection method and apparatus for emotion recognition, electronic device, and storage medium |
| CN111523389A (en) | | Intelligent emotion recognition method and device, electronic equipment and storage medium |
| CN111223488A (en) | | Voice wake-up method, device, equipment and storage medium |
| CN110619568A (en) | | Risk assessment report generation method, device, equipment and storage medium |
| CN113793620B (en) | | Voice noise reduction method, device and equipment based on scene classification and storage medium |
| CN111754982B (en) | | Noise elimination method, device, electronic device and storage medium for voice call |
| CN110457432A (en) | | Interview methods of marking, device, equipment and storage medium |
| CN113807103B (en) | | Recruitment method, device, equipment and storage medium based on artificial intelligence |
| CN113903363B (en) | | Violation behavior detection method, device, equipment and medium based on artificial intelligence |
| CN113064994A (en) | | Conference quality evaluation method, device, equipment and storage medium |
| CN106205624A (en) | | A kind of method for recognizing sound-groove based on DBSCAN algorithm |
| CN109545226B (en) | | Voice recognition method, device and computer readable storage medium |
| CN113808577A (en) | | Intelligent extraction method, device, electronic device and storage medium for speech abstract |
| CN118155634A (en) | | Speech recognition identity method, device and storage medium |
| CN118212927B (en) | | Identity recognition method and system based on sound characteristics, storage medium and electronic equipment |
| CN111985231B (en) | | Unsupervised role recognition method and device, electronic equipment and storage medium |
| CN115631748A (en) | | Emotion recognition method and device based on voice conversation, electronic equipment and medium |
| CN118398225B (en) | | A multimodal analysis method for mining traditional Chinese medicine knowledge |
| CN119761334A (en) | | Work order generation method, device, equipment and medium based on natural language processing |
| CN113035230A (en) | | Authentication model training method and device and electronic equipment |
| CN119629636A (en) | | Spam call identification method, device, computer equipment and storage medium |
| CN113555026B (en) | | Voice conversion method, device, electronic equipment and medium |
| CN118351873A (en) | | An identity authentication method and system based on voiceprint and keyword dual recognition |
| CN114974205A (en) | | Synthetic speech recognition method, device, electronic equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200818 |