TWI904664B

TWI904664B - System and method for voice service with customer identification

Info

Publication number: TWI904664B
Application number: TW113117199A
Authority: TW
Inventors: 李孟達; 黃惠貞; 孫倩如; 陳駿元
Original assignee: 兆豐國際商業銀行股份有限公司
Priority date: 2024-05-09
Filing date: 2024-05-09
Publication date: 2025-11-11
Also published as: TW202544793A

Abstract

A voice service system with customer identification and a method thereof are provided. The voice service system includes a server and a voice service device. The voice service device includes a processor, a storage medium, a transceiver, a first side sound receiving and reproducing device, and a second side sound receiving and reproducing device. The processor is configured to: transmit identity information of a customer to the server to obtain the voice setting corresponding to the customer from the server; adjust at least one of the frequency value and the volume value of the voice setting in response to recognizing a semantic keyword in the voice signal received from the first side sound receiving and reproducing device; adjust the frequency and volume of the voice signal according to the voice setting; and transmit the voice signal to the second side sound receiving and reproducing device.

Description

Voice service system and method with customer identification

The present invention relates to voice service technology, and more particularly to a voice service system with customer identification and a method thereof.

As Taiwan moves toward becoming a super-aged society, the service industry will serve a growing number of elderly customers. Besides the general decline that comes with age, elderly people may also lose hearing at specific voice frequencies because of illness or other factors. In addition, service counters are relatively noisy environments, so elderly customers have difficulty communicating with service personnel when handling business at the counter and therefore have a poor service experience.

In view of this, the present invention provides a voice service system with customer identification and a method thereof, which can adjust the frequency and volume of the service personnel's voice based on the customer's reaction to that voice, so that the customer and the service personnel can communicate smoothly, thereby improving the customer's service experience and reducing the time needed to handle business.

A voice service system with customer identification according to the present invention includes a server and a voice service device. The voice service device includes a processor, a storage medium, a transceiver, a first-side receiver/amplifier, and a second-side receiver/amplifier. The server stores a voice setting. The transceiver is communicatively connected to the server. The processor is coupled to the transceiver, the storage medium, the first-side receiver/amplifier, and the second-side receiver/amplifier, and is configured to: transmit identity verification information of a customer to the server to obtain the voice setting corresponding to the customer from the server; in response to recognizing a semantic keyword in a voice signal received from the first-side receiver/amplifier, adjust at least one of a frequency value and a volume value of the voice setting; adjust the frequency and volume of the voice signal according to the voice setting; and transmit the voice signal to the second-side receiver/amplifier.

In an embodiment of the present invention, the processor does not adjust the voice setting in response to not recognizing a semantic keyword in the voice signal.

In an embodiment of the present invention, the processor, in response to receiving the voice signal from the first-side receiver/amplifier, determines whether a semantic keyword is recognized in the voice signal; and, in response to recognizing the semantic keyword in the voice signal, adjusts at least one of the frequency value and the volume value of the voice setting.

In an embodiment of the present invention, the processor transmits the adjusted voice setting to the server in response to determining that the voice setting has been adjusted.

In an embodiment of the present invention, the server stores a machine learning model. In response to recognizing the semantic keyword in the voice signal, the processor transmits the semantic keyword and the voice setting to the server as inputs to the machine learning model to obtain a suggested value corresponding to the voice setting, and adjusts at least one of the frequency value and the volume value of the voice setting according to the suggested value.

In an embodiment of the present invention, the processor associates all recognized semantic keywords with the adjustment history of the voice setting and transmits the result to the server as training data, wherein the server trains or updates the machine learning model according to the training data.

In an embodiment of the present invention, when the processor transmits the customer's identity verification information to the server and the server reports that no voice setting exists for the customer, the processor sets the frequency value and volume value obtained by analyzing the voice signal as the frequency value and volume value of the voice setting.

In an embodiment of the present invention, the processor performs noise reduction on the voice signal before transmitting it to the second-side receiver/amplifier, and performs noise reduction on a voice signal received from the second-side receiver/amplifier before transmitting it to the first-side receiver/amplifier.

In an embodiment of the present invention, the voice service system further includes an automated teller machine (ATM). When the customer operates the ATM, the server performs an identity verification procedure on the customer through the ATM and, in response to the identity verification procedure succeeding, transmits the voice setting corresponding to the customer to the ATM.

A voice service method with customer identification according to the present invention includes the following steps: transmitting identity verification information of a customer to a server to obtain a voice setting corresponding to the customer from the server; in response to recognizing a semantic keyword in a voice signal received from a first-side receiver/amplifier, adjusting at least one of a frequency value and a volume value of the voice setting; adjusting the frequency and volume of the voice signal according to the voice setting; and transmitting the voice signal to a second-side receiver/amplifier.

Based on the above, the voice service system with customer identification and the method thereof of the present invention can adjust the frequency and volume of the service personnel's voice based on the customer's reaction to that voice, so that the customer and the service personnel can communicate smoothly, thereby improving the customer's service experience and reducing the time needed to handle business. In addition, for a customer who has handled business before, a voice setting suited to that customer can be recorded, so that the frequency and volume of the service personnel's voice can be adjusted more efficiently the next time the customer handles business, further improving the customer's service experience.

Figure 1 is a schematic diagram of a voice service system 10 with customer identification according to an embodiment of the present invention. In this embodiment, the voice service system 10 may include a voice service device 100 and a server 200. The voice service device 100 may include a processor 110, a storage medium 120, a transceiver 130, a first-side receiver/amplifier 141, and a second-side receiver/amplifier 142, wherein the processor 110 may be coupled to the storage medium 120, the transceiver 130, the first-side receiver/amplifier 141, and the second-side receiver/amplifier 142. The server 200 may include a processor 210, a storage medium 220, and a transceiver 230, wherein the processor 210 may be coupled to the storage medium 220 and the transceiver 230.

The processor 110 or 210 may be, for example, a central processing unit (CPU), a digital signal processor (DSP), another programmable general-purpose or special-purpose microprocessor, a programmable controller, an application-specific integrated circuit (ASIC), another similar component, or a combination of these components.

The storage media 120 and 220 store the software, data, and program code needed by the processors 110 and 210, respectively, at run time. The storage medium 120 is, for example, any type of fixed or removable random access memory, read-only memory, flash memory, hard disk, solid-state drive, another similar component, or a combination of these components, and the present invention is not limited thereto. In this embodiment, the storage medium 120 may, for example, store the voice signals received from the first-side receiver/amplifier 141 and the second-side receiver/amplifier 142, together with other required software, data, and program code. The storage medium 220 may, for example, store a plurality of voice settings including the voice setting corresponding to the customer, a machine learning model that outputs suggested values for adjusting the voice setting corresponding to the customer, and other required software, data, and program code.

The transceiver 130 or 230 transmits and receives signals wirelessly or by wire, and may also perform operations such as low-noise amplification, impedance matching, mixing, up or down frequency conversion, filtering, amplification, and the like.

The first-side receiver/amplifier 141 and the second-side receiver/amplifier 142 may each be, for example, a combination of a microphone and a speaker, or a combination of any audio receiving device and any audio playback device. In this embodiment, the first-side receiver/amplifier 141 receives the voice signal of the service personnel and plays the voice signal of the customer, and the second-side receiver/amplifier 142 receives the voice signal of the customer and plays the voice signal of the service personnel.

Figure 2 is a flowchart of a method applied to the voice service system 10 with customer identification according to an embodiment of the present invention.

In step S210, the processor 110 of the voice service device 100 of the voice service system 10 with customer identification transmits the customer's identity verification information to the server 200 to obtain the voice setting corresponding to the customer from the server 200. For example, the voice setting corresponding to the customer may include a frequency value and a volume value at which the customer can clearly hear what is being said.
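As a minimal, non-limiting sketch of step S210, the lookup can be pictured as a keyed retrieval. The `VoiceSetting` fields and their units, the `InMemoryServer` class, and the `fetch_voice_setting` function are hypothetical names introduced only for illustration; an actual device would exchange this data with the server 200 through the transceiver 130.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VoiceSetting:
    frequency_hz: float  # assumed unit: hertz
    volume_db: float     # assumed unit: decibels


class InMemoryServer:
    """Hypothetical stand-in for server 200, backed by a plain dictionary."""

    def __init__(self) -> None:
        self._settings: Dict[str, VoiceSetting] = {}

    def store_setting(self, customer_id: str, setting: VoiceSetting) -> None:
        self._settings[customer_id] = setting

    def get_setting(self, customer_id: str) -> Optional[VoiceSetting]:
        # None signals that no voice setting exists for this customer.
        return self._settings.get(customer_id)


def fetch_voice_setting(server: InMemoryServer, customer_id: str) -> Optional[VoiceSetting]:
    """Step S210: send the customer's identity and receive the stored setting, if any."""
    return server.get_setting(customer_id)
```

Returning `None` for an unknown customer leaves room for the first-visit case, in which the device derives an initial setting itself.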

In step S220, the processor 110, in response to receiving a voice signal from the first-side receiver/amplifier 141, determines whether a semantic keyword is recognized in the voice signal. Specifically, the storage medium 120 of the voice service device 100 may, for example, store a natural language processing program or a language model capable of recognizing the spoken content of the voice signal, and the processor 110 may input the voice signal into the natural language processing program or language model to obtain a semantic keyword indicating that the voice signal needs to be adjusted. For example, a customer's hearing may have declined because of age or illness, so that while handling business with the service personnel the customer may say that they cannot hear clearly what the service personnel are saying, or that the service personnel's voice is too quiet. In that case, the service personnel may say, for example, "The customer says they cannot hear clearly," "The customer says it is too quiet," or another phrase with a similar meaning that serves as a semantic keyword indicating that the voice signal needs to be adjusted. If the processor 110 determines that a semantic keyword indicating that the voice signal needs to be adjusted is recognized in the voice signal, step S230 may be executed.

If the processor 110 determines in step S220 that no semantic keyword is recognized in the voice signal, this indicates that the voice signal adjusted with the frequency value and volume value of the current voice setting already lets the customer hear clearly what the service personnel are saying, so the voice setting is not adjusted. In this case, step S250 may be executed.
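The branch taken in step S220 can be sketched as follows, under the assumption that the voice signal has already been transcribed to text. Plain substring matching stands in here for the natural language processing program or language model; the phrase table, the labels, and the function names are hypothetical.

```python
from typing import List

# Hypothetical phrase table: spoken phrase -> keyword label.
ADJUSTMENT_PHRASES = {
    "cannot hear clearly": "unclear",
    "too quiet": "too_quiet",
}


def find_semantic_keywords(transcript: str) -> List[str]:
    """Return the labels of all adjustment phrases found in the transcript."""
    text = transcript.lower()
    return [label for phrase, label in ADJUSTMENT_PHRASES.items() if phrase in text]


def next_step(transcript: str) -> str:
    """Step S220: go to S230 when a keyword is recognized, otherwise to S250."""
    return "S230" if find_semantic_keywords(transcript) else "S250"
```

A real recognizer would need to match phrases with similar meaning rather than exact wording, which is why the description refers to a language model.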

In step S230, the processor 110 transmits the semantic keyword and the voice setting to the server 200 as inputs to the machine learning model to obtain a suggested value corresponding to the voice setting. Specifically, the server 200 stores the machine learning model. The machine learning model may, for example, receive the semantic keyword and the current frequency value and volume value of the voice setting as inputs, and output a suggested value that includes a suggested frequency value and a suggested volume value, which can be used to adjust the voice setting appropriately. The machine learning model may be, for example, a neural network model based on Generative Pre-trained Transformer 3 (GPT-3) or Long Short-Term Memory (LSTM), but is not limited thereto.
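The input and output shape of step S230 can be illustrated with a rule-based placeholder. This is not a neural network and is not the model of the embodiment; the function name `suggest_setting`, the dictionary keys, and the step sizes (5 dB and 20 Hz) are arbitrary assumptions chosen only to make the interface concrete.

```python
from typing import Dict, List


def suggest_setting(keywords: List[str], current: Dict[str, float]) -> Dict[str, float]:
    """Placeholder for the server-side model: keywords plus current values in, suggestion out."""
    suggested = dict(current)  # copy so the caller's setting is left untouched
    if "too_quiet" in keywords:
        suggested["volume_db"] = current["volume_db"] + 5.0        # assumed step size
    if "unclear" in keywords:
        suggested["frequency_hz"] = current["frequency_hz"] - 20.0  # assumed step size
    return suggested
```

Whatever model is used, keeping the suggestion in the same shape as the voice setting makes the comparison in the following step straightforward.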

In step S240, the processor 110 adjusts at least one of the frequency value and the volume value of the voice setting according to the suggested value. For example, when the suggested frequency value differs from the frequency value of the current voice setting, the processor 110 adjusts the frequency value of the voice setting according to the suggested frequency value. When the suggested volume value differs from the volume value of the current voice setting, the processor 110 adjusts the volume value of the voice setting according to the suggested volume value.
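The comparison in step S240 can be sketched as below. The dictionary keys and the function name `apply_suggestion` are hypothetical; the returned flag records whether anything changed, which a later step could use to decide whether the setting needs to be sent back to the server.

```python
from typing import Dict, Tuple


def apply_suggestion(
    setting: Dict[str, float], suggestion: Dict[str, float]
) -> Tuple[Dict[str, float], bool]:
    """Step S240: copy over only the suggested values that differ from the current ones."""
    updated = dict(setting)
    changed = False
    for key in ("frequency_hz", "volume_db"):
        if key in suggestion and suggestion[key] != setting[key]:
            updated[key] = suggestion[key]
            changed = True
    return updated, changed
```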

In step S250, the processor 110 adjusts the frequency and volume of the voice signal according to the voice setting. In detail, the processor 110 adjusts, according to the voice setting, the frequency and volume of the service personnel's voice signal received from the first-side receiver/amplifier 141. In this way, the customer can easily hear what the service personnel are saying.
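The description does not specify how the frequency and volume are changed, so the following is only one possible sketch of step S250. It assumes the signal is a list of samples in the range -1.0 to 1.0, applies volume as a decibel gain, and shifts pitch by naive resampling with linear interpolation, which also changes the duration. A production system would more likely use a duration-preserving pitch shifter. The function names and the way a gain or ratio is derived from the voice setting are assumptions.

```python
from typing import List


def apply_gain(samples: List[float], gain_db: float) -> List[float]:
    """Scale the samples by a gain in decibels and clip to the range [-1.0, 1.0]."""
    factor = 10.0 ** (gain_db / 20.0)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]


def shift_pitch(samples: List[float], ratio: float) -> List[float]:
    """Naive pitch shift: read the input `ratio` times faster, interpolating linearly.

    A ratio above 1 raises the pitch and shortens the signal; below 1 does the opposite.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    n_out = int(len(samples) / ratio)
    out = []
    for i in range(n_out):
        pos = i * ratio
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out
```

For instance, if the voice setting calls for 180 Hz and the measured voice is at 200 Hz, the ratio under this sketch would be 180 / 200 = 0.9.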

In an embodiment, when the processor 110 transmits the customer's identity verification information to the server 200 and the server 200 reports that no voice setting exists for the customer, the processor 110 sets the frequency value and volume value obtained by analyzing the service personnel's voice signal received from the first-side receiver/amplifier 141 as the frequency value and volume value of the voice setting. In this way, customers who are handling business for the first time can also be accommodated.
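One simple way to obtain an initial frequency value and volume value from a voice signal is sketched below. The analysis method is not specified in the description; this sketch assumes a zero-crossing count as a rough frequency estimate and the root-mean-square level in dB relative to full scale as the volume. Both function names are hypothetical.

```python
import math
from typing import List


def estimate_frequency_hz(samples: List[float], sample_rate: int) -> float:
    """Rough frequency estimate: each full cycle of a tone crosses zero twice."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration = len(samples) / sample_rate
    return crossings / (2.0 * duration) if duration > 0 else 0.0


def rms_dbfs(samples: List[float]) -> float:
    """Root-mean-square level in dB relative to full scale (amplitude 1.0)."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")
```

Zero-crossing counting works reasonably for a clean tone but is easily misled by noise and harmonics, so a real analysis of speech would likely use a dedicated pitch estimator.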

In an embodiment, when the processor 110 determines that the frequency value and volume value of the voice setting are similar to the frequency value and volume value obtained by analyzing the service personnel's voice signal received from the first-side receiver/amplifier 141, the processor 110 does not adjust the frequency and volume of the voice signal. For example, when the customer has no hearing loss, the adjustment of the frequency and volume of the voice signal can be omitted.
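The similarity test can be sketched as a pair of tolerance checks. The description does not define what counts as similar, so the default tolerances of 10 Hz and 2 dB, like the function name, are assumptions.

```python
def needs_adjustment(
    setting_freq_hz: float,
    setting_vol_db: float,
    measured_freq_hz: float,
    measured_vol_db: float,
    freq_tol_hz: float = 10.0,  # assumed tolerance
    vol_tol_db: float = 2.0,    # assumed tolerance
) -> bool:
    """Return False when the measured voice is already close to the voice setting."""
    return (
        abs(setting_freq_hz - measured_freq_hz) > freq_tol_hz
        or abs(setting_vol_db - measured_vol_db) > vol_tol_db
    )
```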

In step S260, the processor 110 performs noise reduction on the voice signal. The processor 110 may, for example, use active noise cancellation (ANC) to reduce noise in the voice signal. In detail, the processor 110 may perform noise reduction on the service personnel's voice signal received from the first-side receiver/amplifier 141 before transmitting it to the second-side receiver/amplifier 142, and may perform noise reduction on the customer's voice signal received from the second-side receiver/amplifier 142 before transmitting it to the first-side receiver/amplifier 141. In this way, noise near the counter is prevented from interfering with the customer's business.
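Active noise cancellation requires a noise reference and adaptive filtering and is beyond a short example. As a deliberately simpler substitute, the sketch below shows a frame-based noise gate that silences frames whose level falls below a threshold. It is not ANC and is not the method of the embodiment; the function name, threshold, and frame length are assumptions.

```python
import math
from typing import List


def noise_gate(samples: List[float], threshold: float = 0.02, frame: int = 160) -> List[float]:
    """Zero out every frame whose RMS level is below the threshold."""
    out: List[float] = []
    for start in range(0, len(samples), frame):
        chunk = samples[start:start + frame]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        out.extend(chunk if rms >= threshold else [0.0] * len(chunk))
    return out
```

The same function can be applied in both directions, once to the service personnel's signal and once to the customer's signal, mirroring the two paths described for step S260.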

In step S270, the processor 110 transmits the voice signal to the second-side receiver/amplifier 142. Taken as a whole, the voice service device 100 appropriately adjusts the frequency and volume of the service personnel's voice signal received from the first-side receiver/amplifier 141 and performs noise reduction on it, so that the customer can clearly hear the content of the voice signal from the second-side receiver/amplifier 142.

In step S280, the processor 110 transmits the adjusted voice setting to the server 200 in response to determining that the voice setting has been adjusted. For example, the adjusted voice setting includes a frequency value and a volume value suited to the customer. The next time the customer handles business, the processor 110 can obtain the frequency value and volume value of the voice setting suited to the customer from the server 200 and use them to adjust the voice signal received from the first-side receiver/amplifier 141. In this way, the voice signal can be adjusted more efficiently.
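Step S280 amounts to a conditional write-back, sketched below with a dictionary standing in for the storage on the server 200. The function name `sync_setting` and the dictionary-based store are hypothetical.

```python
from typing import Dict


def sync_setting(
    store: Dict[str, Dict[str, float]],
    customer_id: str,
    original: Dict[str, float],
    current: Dict[str, float],
) -> bool:
    """Step S280: upload the setting only when it differs from what was fetched."""
    if current == original:
        return False  # nothing was adjusted, so nothing is sent
    store[customer_id] = dict(current)
    return True
```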

In an embodiment, the voice service system 10 further includes an automated teller machine (ATM). When the customer operates the ATM, the server 200 performs an identity verification procedure on the customer through the ATM and, in response to the identity verification procedure succeeding, transmits the voice setting corresponding to the customer to the ATM. In this way, the customer can clearly hear the content of the voice signal played by the ATM.

In an embodiment, the first-side receiver/amplifier 141 may be, for example, a telephone device on the service personnel's side. In this way, when a customer receives customer service by telephone, the customer can also clearly hear over the telephone the content of the voice signal received from the first-side receiver/amplifier 141.

In step S290, the processor 110 associates all recognized semantic keywords with the adjustment history of the voice setting and transmits the result to the server 200 as training data. In this way, the server 200 can train or update the machine learning model according to training data that reflects the customer's actual reactions, which helps ensure that the suggested values output by the machine learning model better match the customer's reactions and avoids outputting suggested values that the customer has already reported as hard to hear.
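The association described in step S290 can be pictured as a list of records, each pairing a recognized keyword with the setting before and after the adjustment it triggered. The record layout, the `AdjustmentRecord` class, and the JSON encoding are assumptions made for illustration; the description does not specify a data format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class AdjustmentRecord:
    keyword: str               # recognized semantic keyword label
    before: Dict[str, float]   # voice setting before the adjustment
    after: Dict[str, float]    # voice setting after the adjustment


def build_training_payload(records: List[AdjustmentRecord]) -> str:
    """Step S290: serialize the keyword/adjustment history for upload to the server."""
    return json.dumps([asdict(record) for record in records])
```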

In an embodiment, the processor 110 stores all voice signals received from the first-side receiver/amplifier 141 and the second-side receiver/amplifier 142 in the storage medium 120 and transmits them to the server 200, where they serve as a record and proof of the business handled between the service personnel and the customer, in order to reduce potential disputes.

Figure 3 is a flowchart of a method applied to the voice service system 10 with customer identification according to an embodiment of the present invention.

In step S310, the customer's identity verification information is transmitted to the server 200 to obtain the voice setting corresponding to the customer from the server 200. In step S320, in response to recognizing a semantic keyword in the voice signal received from the first-side receiver/amplifier 141, at least one of the frequency value and the volume value of the voice setting is adjusted. In step S330, the frequency and volume of the voice signal are adjusted according to the voice setting. In step S340, the voice signal is transmitted to the second-side receiver/amplifier 142.
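The order of steps S310 to S340 can be sketched as a single function whose collaborators are passed in as callables. Every name here is hypothetical, and the callables stand in for the server lookup, the keyword recognizer, the setting adjustment, the signal processing, and playback on the second-side receiver/amplifier 142.

```python
from typing import Callable, Dict, List

Setting = Dict[str, float]
Samples = List[float]


def serve_utterance(
    customer_id: str,
    samples: Samples,
    fetch_setting: Callable[[str], Setting],
    recognize_keywords: Callable[[Samples], List[str]],
    adjust_setting: Callable[[Setting, List[str]], Setting],
    process_signal: Callable[[Samples, Setting], Samples],
    play: Callable[[Samples], None],
) -> Setting:
    """Run one utterance through S310 to S340 and return the setting that was used."""
    setting = fetch_setting(customer_id)             # S310
    keywords = recognize_keywords(samples)           # S320: recognize keywords
    if keywords:
        setting = adjust_setting(setting, keywords)  # S320: adjust the setting
    play(process_signal(samples, setting))           # S330 then S340
    return setting
```

Passing the collaborators in keeps the control flow visible and lets each step be replaced independently, for example swapping a placeholder recognizer for a language model.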

In summary, the voice service system with customer identification and the method thereof of the present invention can adjust the frequency and volume of the service personnel's voice based on the customer's reaction to that voice, so that the customer and the service personnel can communicate smoothly, thereby improving the customer's service experience and reducing the time needed to handle business. In addition, for a customer who has handled business before, a voice setting suited to that customer can be recorded, so that the frequency and volume of the service personnel's voice can be adjusted more efficiently the next time the customer handles business, further improving the customer's service experience.

10: voice service system
100: voice service device
110, 210: processor
120, 220: storage medium
130, 230: transceiver
141: first-side receiver/amplifier
142: second-side receiver/amplifier
200: server
S210, S220, S230, S240, S250, S260, S270, S280, S290, S310, S320, S330, S340: steps

Figure 1 is a schematic diagram of a voice service system with customer identification according to an embodiment of the present invention. Figure 2 is a flowchart of a method applied to the voice service system with customer identification according to an embodiment of the present invention. Figure 3 is a flowchart of a method applied to the voice service system with customer identification according to an embodiment of the present invention.

Claims

A voice service system with customer identification, comprising: a server storing a voice setting; and a voice service device comprising: a first-side receiver/amplifier and a second-side receiver/amplifier; a transceiver communicatively connected to the server; a storage medium; and a processor coupled to the transceiver, the storage medium, the first-side receiver/amplifier, and the second-side receiver/amplifier, wherein the processor is configured to perform: transmitting identity verification information of a customer to the server to obtain the voice setting corresponding to the customer from the server; in response to recognizing a semantic keyword in a voice signal of service personnel received from the first-side receiver/amplifier, adjusting at least one of a frequency value and a volume value of the voice setting; adjusting the frequency and volume of the voice signal according to the voice setting; and transmitting the voice signal to the second-side receiver/amplifier.

The voice service system as described in claim 1, wherein the processor is further configured to not adjust the voice setting in response to not recognizing the semantic keyword in the voice signal.

The voice service system as described in claim 1, wherein the processor is further configured to: in response to receiving the voice signal from the first-side receiver/amplifier, determine whether the semantic keyword is recognized in the voice signal; and in response to recognizing the semantic keyword in the voice signal, adjust at least one of the frequency value and the volume value of the voice setting.

The voice service system as described in claim 1, wherein the processor is further configured to transmit the adjusted voice setting to the server in response to determining that the voice setting has been adjusted.

The voice service system as described in claim 1, wherein the server stores a machine learning model, and the processor is further configured to: in response to recognizing the semantic keyword in the voice signal, transmit the semantic keyword and the voice setting to the server as inputs to the machine learning model to obtain a suggested value corresponding to the voice setting; and adjust at least one of the frequency value and the volume value of the voice setting according to the suggested value.

The voice service system as described in claim 5, wherein the processor is further configured to associate all recognized semantic keywords with the adjustment history of the voice setting and transmit the result to the server as training data, wherein the server trains or updates the machine learning model according to the training data.

The voice service system as described in claim 1, wherein the processor is further configured to, when the customer's identity verification information is transmitted to the server and the server reports that no voice setting exists for the customer, set the frequency value and volume value obtained by analyzing the voice signal as the frequency value and volume value of the voice setting.

The voice service system as described in claim 1, wherein the processor is further configured to perform noise reduction on the voice signal before transmitting it to the second-side receiver/amplifier, and to perform noise reduction on a voice signal received from the second-side receiver/amplifier before transmitting it to the first-side receiver/amplifier.

The voice service system as described in claim 1, wherein the voice service system further includes an automated teller machine (ATM), and the server is further configured to perform: when the customer operates the ATM, performing an identity verification procedure on the customer through the ATM, and in response to the identity verification procedure succeeding, transmitting the voice setting corresponding to the customer to the ATM.

A voice service method with customer identification, comprising: transmitting identity verification information of a customer to a server to obtain a voice setting corresponding to the customer from the server; in response to recognizing a semantic keyword in a voice signal of service personnel received from a first-side receiver/amplifier, adjusting at least one of a frequency value and a volume value of the voice setting; adjusting the frequency and volume of the voice signal according to the voice setting; and transmitting the voice signal to a second-side receiver/amplifier.