RU2730449C2

RU2730449C2 - Method of creating model for analysing dialogues based on artificial intelligence for processing user requests and system using such model

Info

Publication number: RU2730449C2
Application number: RU2019102403A
Authority: RU
Inventors: Денис Олегович Антюхов; Леонид Петрович Пугачёв
Original assignee: Публичное Акционерное Общество "Сбербанк России" (Пао Сбербанк)
Priority date: 2019-01-29
Filing date: 2019-01-29
Publication date: 2020-08-21
Also published as: EA038264B1; EA201990216A1; RU2019102403A; RU2019102403A3; WO2020159395A1

Abstract

FIELD: data processing.

SUBSTANCE: invention relates to the field of data processing. The method of creating an AI-based dialogue analysis model for processing user requests comprises the steps of: obtaining a set of primary data that includes at least text data of dialogues containing user requests and operator responses; processing the obtained data set to form a training sample for an artificial neural network, the sample containing positive and negative examples of user requests based on analysis of the dialogue context, wherein the positive examples comprise a semantically related set of operator utterances in response to the user's request; extracting and encoding vector representations of each utterance from the training sample examples of the previous step; and using the formed training sample to train a model that determines relevant utterances from the context of user requests in dialogues.

EFFECT: technical result is wider range of means.

9 cl, 7 dwg

Description

FIELD OF TECHNOLOGY

[0001] The present technical solution relates, in general, to the field of computational data processing and, in particular, to machine learning methods for building models that analyze natural language dialogues.

BACKGROUND

[0002] Automated natural language recognition systems are now widespread across many fields of technology. These technologies are used most widely in the consumer sector, in software applications such as search engines, navigators, and product selection applications, for example through intelligent assistants. A key feature of such intelligent assistants is the ability to accurately recognize voice commands issued by users.

[0003] An existing difficulty is building models for analyzing speech messages that can form and deliver a response to a user's request with a given accuracy and speed, especially in specialized application domains, which require careful tuning and training of such models.

[0004] The prior art currently includes many approaches to creating and training models for natural language processing (NLP). One known principle of creating models with a machine learning algorithm consists in filtering sentences using a recurrent neural network and the "bag of words" algorithm (patent application US 20180268298, applicant: Salesforce.com Inc., published 20.09.2018). The known approach discloses sentiment analysis using two types of models, simple and complex, which classify the incoming natural language message. The disadvantages of the known approach are low accuracy and low speed, which stem from the use of several models selected depending on the type and complexity of the incoming request.

ESSENCE OF THE TECHNICAL SOLUTION

[0005] The claimed technical solution offers a new approach to applying artificial intelligence (AI) by creating machine learning models for processing user requests in natural language.

[0006] The technical problem, or technical task, to be solved is to provide a new method of creating a model for analyzing natural language requests that offers high accuracy in recognizing the context of a request and high speed in processing incoming requests.

[0007] The main technical result achieved in solving the above technical problem is the creation of a model for analyzing natural language user requests that recognizes the context of requests with high accuracy, owing to the ability to rank responses to incoming user requests.

[0008] The claimed result is achieved by a computer-implemented method of creating an AI-based dialogue analysis model for processing user requests, performed by at least one processor and comprising the steps of:

• obtaining a set of primary data, the set including at least text data of dialogues between users and operators that contain user requests and operator responses;

• processing the obtained data set to form a training sample for an artificial neural network, the sample containing positive and negative examples of user requests based on analysis of the dialogue context, wherein the positive examples contain a semantically related set of operator utterances in response to the user's request;

• extracting and encoding vector representations of each utterance from the positive and negative examples of the training sample mentioned in the previous step;

• using the formed training sample to train a model that determines relevant utterances from the context of user requests in dialogues.

[0009] In one particular embodiment of the method, the model is at least one artificial neural network.

[0010] In another particular embodiment of the method, positive examples are formed from completed chains of dialogue between the operator and the client, where each such chain contains at least one interrogative sentence.

[0011] In another particular embodiment of the method, when relevant utterances are selected to answer the client's request phrase at the model training stage, a score is calculated for each response utterance.

[0012] In another particular embodiment of the method, at the step of encoding utterances into vectors, utterances that represent sentences are encoded as a matrix of semantic vectors.

[0013] The specified technical result is also achieved by a system for processing user requests in an information channel using artificial intelligence, which comprises at least one processor and at least one memory connected to the processor. The memory contains machine-readable instructions which, when executed by the at least one processor, provide for: receiving a user request via the information channel; processing the user request with a machine learning model for automated processing of user requests, created by the method described above; and forming and transmitting a response message to the user's request in the information channel.

[0014] In a particular implementation, the system is a server, a mainframe, or a supercomputer.

[0015] In another particular implementation of the system, the information channel is a chat session, a VoIP connection, or a telephone channel.

[0016] In another particular implementation of the system, the chat session is a chat in a mobile application or a chat on a website.

DESCRIPTION OF DRAWINGS

[0017] Features and advantages of the present invention will become apparent from the following detailed description of the invention and the accompanying drawings, in which:

[0018] FIG. 1 illustrates a flowchart of the claimed method.

[0019] FIG. 2 illustrates an example of data processing to form a training sample.

[0020] FIG. 3 illustrates the architecture of the model for detecting interrogative sentences.

[0021] FIG. 4 illustrates a method of training the model for determining relevant utterances.

[0022] FIG. 5 illustrates the architecture of the model for determining relevant utterances.

[0023] FIG. 6 illustrates an example of applying the trained model for determining relevant utterances.

[0024] FIG. 7 illustrates a general view of the claimed system.

IMPLEMENTATION OF THE INVENTION

[0025] For clarity, this technical solution may use terms such as "operator", "client", and "bank employee", which in general should be understood as a "user" of the system.

[0026] The claimed method (100) of creating an AI-based dialogue analysis model for processing user requests, as shown in FIG. 1, consists of a series of sequential steps performed by a processor.

[0027] The initial step (101) in forming the dialogue analysis model is obtaining a primary ("raw") data set from which the training sample for an artificial neural network (ANN) will be built. The primary data set can be an array of unlabeled text logs (records) of dialogues between operators and clients recorded while handling incoming requests. The subject matter of the text logs can vary, depending on the requirements for the analysis model in the industry where it will ultimately be applied.

[0028] Client requests are understood as any requests arriving through the information channels used to interact with an operator of a contact center or support service, for example that of a financial institution. As a rule, the primary data are recordings of conversations between clients and operators, converted into text. The information channels from which conversation data are obtained can be, without limitation, a telephone channel, a VoIP channel, a chat session on a website or in a mobile banking application, a messenger chat bot, and so on. Any type of channel through which a client can hold a dialogue with an operator can be used to obtain dialogue data, which are then converted into text for training the machine learning model.

[0029] As noted above, the operator in this solution may be either a person who handles client requests or a software algorithm, for example a chat bot or an intelligent answering machine, that can likewise provide information for handling client requests.

[0030] The array of primary log data obtained at step (101) is then processed to form the training sample (102) for the ANN. Question-answer pairs are formed from the array of logs. This procedure includes a data collection algorithm, the use of a model for detecting interrogative sentences, and an algorithm for forming question-answer pairs. The interrogative sentence detection model is used to find client questions in the text logs. The operator utterance that follows a found client question is considered the answer to that question, provided it satisfies a number of requirements: it is not interrogative, it is long enough, and it contains no stop words.
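The pairing rule of paragraph [0030] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `(speaker, text)` log format, the `STOP_WORDS` set, the `MIN_ANSWER_WORDS` threshold, and the `naive_is_question` stand-in for model (220) are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text); speaker is "client" or "operator" (assumed log format)

STOP_WORDS = {"unfortunately"}  # hypothetical stop-word list
MIN_ANSWER_WORDS = 4            # hypothetical "long enough" threshold


def naive_is_question(text: str) -> bool:
    """Stand-in for the interrogative sentence model (220): a plain '?' check."""
    return text.strip().endswith("?")


def is_valid_answer(text: str, is_question: Callable[[str], bool]) -> bool:
    """Apply the three requirements: not interrogative, long enough, no stop words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if is_question(text) or len(words) < MIN_ANSWER_WORDS:
        return False
    return not any(w in STOP_WORDS for w in words)


def extract_qa_pairs(
    dialog: List[Turn], is_question: Callable[[str], bool] = naive_is_question
) -> List[Tuple[str, str]]:
    """Pair each client question with the operator utterance that directly follows it."""
    pairs = []
    for (speaker, text), (next_speaker, next_text) in zip(dialog, dialog[1:]):
        if (
            speaker == "client"
            and is_question(text)
            and next_speaker == "operator"
            and is_valid_answer(next_text, is_question)
        ):
            pairs.append((text, next_text))
    return pairs
```

In practice the `is_question` argument would be replaced by the trained detection model, and the thresholds would be tuned on the actual logs.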

[0031] Processing the array of operator-client dialogue data consists in forming positive and negative examples from the question-answer pairs. As a rule, the examples take the following form: (conversation context (2-5 utterances), the client's request, the operator's response to that request) is a positive example; (conversation context (2-5 utterances), the client's request, the operator's response to some other request) is a negative example.

[0032] Examples of question-answer pairs are given below.

[0033] Example 1 (positive):

ctx: ['Hello, Ivan Ivanovich!', 'How can I help?', 'Hello, can I get a credit card']

rsp: 'You can check the credit card terms and apply at the link: http://www.sberbank.ru/moscow/ru/person/bank_cards/credit/'

[0034] Example 2 (negative):

ctx: ['Hello, Kirill!', 'How can I help you?', 'Hello, how do I activate the mobile banking service?']

rsp: 'Only Visa Gold and MasterCard Gold cards can be ordered online.'
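One way to assemble such labeled examples from question-answer pairs is sketched below. It assumes the `ctx`/`rsp` layout of the examples above (the client's question is the last element of the context); the function name, the label encoding (1 positive, 0 negative), and random sampling of the mismatched response are illustrative choices, since the source does not say how the "other" response is picked.

```python
import random
from typing import List, Tuple

QAItem = Tuple[List[str], str]         # (context ending with the client question, operator answer)
Example = Tuple[List[str], str, int]   # (context, response, label)


def build_examples(
    qa_items: List[QAItem], negatives_per_positive: int = 1, seed: int = 0
) -> List[Example]:
    """Emit one positive per item plus negatives that reuse answers to other requests."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    examples: List[Example] = []
    for i, (ctx, answer) in enumerate(qa_items):
        examples.append((ctx, answer, 1))
        # Candidate negatives: answers given to other requests, excluding identical text.
        others = [a for j, (_, a) in enumerate(qa_items) if j != i and a != answer]
        k = min(negatives_per_positive, len(others))
        for wrong in rng.sample(others, k):
            examples.append((ctx, wrong, 0))
    return examples
```

With `negatives_per_positive=1` the resulting set has one negative per positive; raising the value yields more negatives per context.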

[0035] Question-answer pairs for the ANN training sample are formed using the interrogative sentence analysis model, which is needed to split the log context correctly and to build the training data set. Here, context means a time-ordered set of operator and client utterances. The last utterance in the context is always the client's question.

[0036] FIG. 2 shows an example of using the interrogative sentence detection model to form the ANN training sample at step (102). The interrogative sentence detection model (220) is a machine learning model, for example an artificial neural network. The model (220) can be trained on the OpenSubtitles (OPUS) data set (http://opus.nlpl.eu/) (221) together with data from chats with an operator (222). The OPUS data set (221) is an open data set of film subtitles in various languages, used as a source of the colloquial vocabulary typically found in feature films.

[0037] All sentences containing a question mark were selected from data sets (221) and (222) as positive examples, and all other sentences as negative examples. In OPUS (221), all sentences ending with a question mark are interrogative, since the punctuation in subtitles is always correct. The same holds for interrogative sentences from the chat data (222). The interrogative sentences extracted in this way undergo additional filtering: sentences that are short or contain stop words are discarded. As with the positive examples, most OPUS (221) sentences without a question mark are not interrogative and are selected as negative examples. In addition, sentences that contain interrogative words or are too short are discarded, with the length threshold set in advance.
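The labeling and filtering rules of paragraph [0037] might look like this in code. The word lists and the `MIN_WORDS` threshold are placeholders invented for the sketch; the source does not give their actual values.

```python
from typing import Optional

QUESTION_WORDS = {"who", "what", "when", "where", "why", "how"}  # hypothetical list
STOP_WORDS = {"uh", "hmm"}                                       # hypothetical list
MIN_WORDS = 3                                                    # hypothetical length threshold


def label_sentence(sentence: str) -> Optional[int]:
    """Return 1 for a question, 0 for a non-question, None if the sentence is discarded."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    words = [w for w in words if w]
    if len(words) < MIN_WORDS:
        return None  # too short for either class
    if "?" in sentence:
        # Positive candidate: dropped if it contains a stop word.
        return None if any(w in STOP_WORDS for w in words) else 1
    # Negative candidate: dropped if it contains an interrogative word.
    return None if any(w in QUESTION_WORDS for w in words) else 0
```

Dropping negatives that contain interrogative words guards against questions that were typed without a question mark ending up in the non-question class.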

[0038] Negative examples can be generated in any quantity, so any ratio of positive to negative examples can be achieved in the ANN training sample. Experiments have shown that the best quality is achieved when the ratio of positive to negative examples is 1:1.

[0039] In an exemplary embodiment, the training sample is balanced and all punctuation is removed, so that the model (220) bases its predictions solely on the semantics of the words in a sentence. All of the raw data obtained were used, but the number of positive and negative examples was kept equal in each batch (data fragment) when training the model (220). Semantic analysis uses the fasttext semantic word model and a recurrent neural network based on LSTM (Long Short-Term Memory) to model sentence semantics. FastText is a library for learning word embeddings and classifying text, created by Facebook's AI Research lab. The accuracy of this data set processing procedure is about 95%.
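A rough sketch of punctuation stripping and per-batch 1:1 balancing is given below. It is an assumption-laden illustration: `string.punctuation` covers only ASCII marks, and the generator simply stops when the smaller class runs out, whereas the source states that all raw data were used (which would require, for instance, resampling the smaller class).

```python
import random
import string
from typing import Iterator, List, Tuple

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)  # ASCII punctuation only


def strip_punctuation(text: str) -> str:
    """Remove punctuation so that only the words remain as a signal."""
    return text.translate(_PUNCT_TABLE).strip()


def balanced_batches(
    positives: List[str], negatives: List[str], batch_size: int, seed: int = 0
) -> Iterator[List[Tuple[str, int]]]:
    """Yield batches holding the same number of positive and negative sentences."""
    if batch_size % 2:
        raise ValueError("batch_size must be even for a 1:1 ratio")
    rng = random.Random(seed)
    half = batch_size // 2
    pos, neg = positives[:], negatives[:]
    rng.shuffle(pos)
    rng.shuffle(neg)
    # Simplification: surplus examples of the larger class are left unused.
    for b in range(min(len(pos), len(neg)) // half):
        batch = [(strip_punctuation(s), 1) for s in pos[b * half:(b + 1) * half]]
        batch += [(strip_punctuation(s), 0) for s in neg[b * half:(b + 1) * half]]
        rng.shuffle(batch)  # avoid a fixed positive-then-negative order
        yield batch
```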

[0040] LSTM is a type of recurrent neural network with feedback connections that is widely used in industry for modeling time series and other sequences. The architecture has found its widest application in computational linguistics, where it is used to model the semantics of sentences or whole paragraphs of text.

[0041] FIG. 3 shows the architecture of the interrogative sentence detection model (220), presented as a neural network model in the form of an acyclic computational graph.

[0042] The architecture diagram of the model (220) gives example dimensions of the input and output tensors for each block.

Example notation:

Input: (None, 20)

Output: (None, 20, 300)

[0043] This example means that the block takes a tensor of shape (batch_size, 20) as input and returns a tensor of shape (batch_size, 20, 300). The batch size (data packet size) for the trained model can be arbitrary (it affects only performance and depends on the runtime environment), which is why the notation uses None for that dimension.
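The shape change in this notation is what an embedding lookup produces: each of the 20 token ids in a row is replaced by a 300-dimensional vector. The toy sketch below uses plain Python lists; the vocabulary size, the all-zero table, and the helper names are assumptions for illustration only.

```python
from typing import List, Tuple

VOCAB_SIZE, SEQ_LEN, EMB_DIM = 10, 20, 300  # VOCAB_SIZE is an arbitrary toy value


def embed_batch(token_ids: List[List[int]], table: List[List[float]]) -> List[List[List[float]]]:
    """Replace every token id with its row of the embedding table."""
    return [[table[t] for t in row] for row in token_ids]


def shape(x) -> Tuple[int, ...]:
    """Report the nested-list dimensions, mimicking a tensor shape."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


table = [[0.0] * EMB_DIM for _ in range(VOCAB_SIZE)]                 # placeholder vectors
batch = [[i % VOCAB_SIZE for i in range(SEQ_LEN)] for _ in range(4)]  # batch_size = 4

print(shape(batch))                      # (4, 20)
print(shape(embed_batch(batch, table)))  # (4, 20, 300)
```

The leading dimension (4 here) is the one written as None in the notation, since it may differ from call to call.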

[0044] The model (220) has one input node for text data (inp_ctx_0) (2201) and one output node for the model prediction (relevance) (2211). The pretrained embeddings (vector representations of words) come from a FastText model built on Russian-language texts. A bidirectional LSTM module is used as the encoder.

[0045] The word vectorization module (2202) contains a pretrained word embedding model for word-level vectorization. Each sentence fed to the model input is already tokenized, that is, represented as a list of words. All sentences are represented as sequences of equal length, which is needed for efficient batch processing. Short sentences are padded to this fixed length with a zero token, and sentences that are too long are truncated. Hereinafter the sequence length is denoted MAX_LEN. The experiments used MAX_LEN=24, although other values are possible.
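Bringing token sequences to a fixed length can be sketched in a few lines. The source specifies only a zero padding token and MAX_LEN=24; padding and truncating at the end of the sequence (rather than the start) is an assumption of this example, as is the `pad_or_truncate` name.

```python
from typing import List

MAX_LEN = 24  # value reported for the experiments
PAD_ID = 0    # the "zero token" used for padding


def pad_or_truncate(token_ids: List[int], max_len: int = MAX_LEN, pad_id: int = PAD_ID) -> List[int]:
    """Cut sequences longer than max_len and right-pad shorter ones with pad_id."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))
```

After this step every sentence in a batch has the same length, so the batch can be stacked into a single (batch_size, MAX_LEN) array of token ids.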

[0046] A word embedding is a vector representation of a word obtained with a distributional language model (usually software tools for analyzing natural-language semantics such as word2vec, fasttext, or GloVe). It is a vector with a dimension on the order of several hundred (100-1000). Its characteristic feature is that words with similar meanings are represented by vectors that are close to each other (in the Euclidean L2 metric).

[0047] Each word is assigned a semantic vector, the so-called word embedding (see https://ru.wikipedia.org/wiki/Word2vec). For this, a fasttext model trained on a domain-specific dataset, for example a banking dataset, is used (see https://arxiv.org/abs/1607.04606). The advantage of fasttext over the basic word2vec is its ability to process (vectorize) words that were absent from the training set. The fasttext semantic vectors have a dimension on the order of several hundred, denoted here as EMB_DIM. The experiments with the architecture of the presented model (220) used EMB_DIM = 300.

[0048] As a result of these procedures, each sentence at the input of the model (220) is mapped to a matrix of shape (MAX_LEN, EMB_DIM). This functionality is encapsulated in the word_embedding_model module (2202). The module (2202) is used to vectorize both the context sentences and the response sentences. The word vectorization module (2202) is not trained while the model (220) is being fitted, because its word vectors are fixed and no longer change.

[0049] The sentence vectorization module (2203) contains a model for vectorizing an entire sentence. Each sentence, represented as the (MAX_LEN, EMB_DIM) matrix produced by the module (2202), is encoded into a vector of fixed dimension. A recurrent neural network of the LSTM type is used for this. The matrix obtained from the module (2202) is processed left to right by the LSTM module, and the last internal state of the LSTM (that is, the state corresponding to the last word of the sentence) is taken as the vector representation of the sentence.

[0050] As a result of the operation of the module (2203), each sentence is represented as a vector of fixed dimension LSTM_DIM. As a working example, a cell dimension of LSTM_DIM = 340 was used. Thus a request context consisting of CTX_LEN replicas of at most SEQ_LEN words each (input inp_ctx) is mapped to a matrix of shape (CTX_LEN, LSTM_DIM). A candidate consisting of one sentence is mapped to a vector of dimension LSTM_DIM. A candidate is one of the possible responses for the given context. This functionality is encapsulated in the sentence vectorization module (2203). The module (2203) contains most of the trainable parameters of the model (2-5 million depending on the configuration) and is the most computationally heavy.

[0051] The subsampling (pooling) modules (2204, 2205) receive as input a vector of fixed dimension taken from the internal state of the RNN of the sentence vectorization module (2203). In a particular embodiment, the modules (2204, 2205) may be part of the sentence vectorization module (2203).

[0052] The concatenation module (2206) concatenates the vectors received from the modules (2204, 2205) into one, so that they can be passed to the multilayer perceptron (2207) as a single vector.

[0053] The multilayer perceptron (MLP) (2207) is, in particular, a two-layer perceptron in which fully connected (Dense) layers alternate with regularization (Dropout) layers. The Dropout value for the MLP in this example is 0.3. Dropout is a regularization technique for neural networks that is used to combat overfitting (see, for example, http://imlr.org/papers/volume15/srivastava14a/srivastava14a.pdf).

[0054] The module (2208) is the output neuron, with a sigmoid activation function, of the interrogative-sentence detection model (220) and holds the prediction of the model (220) made from the processed input text data. In the presented example, the architecture of the model (220) reached 0.945 AUC (Area under the ROC Curve) and 0.875 ACC (accuracy) on the validation sample. The area under the ROC curve is an aggregate measure of classification quality that does not depend on the ratio of error costs. The higher the AUC, the better the classification model. This measure is often used to compare several classification models.

[0055] Next, consider the step of generating the training sample (102) for the request analysis model. The training sample is generated (102) by applying the interrogative-sentence extraction model (220) to the input dataset (210), which consists of unlabeled chat dialogues between clients and operators. Every client replica in every chat is processed with the model (220).

[0056] At step (103) the replicas are tokenized and represented as sequences of word vectors, after which they are fed to the input of the interrogative-sentence extraction model (220). While processing the replicas, the model (220) estimates the probability that a replica is an interrogative sentence. If a request replica is followed by several operator messages, a positive training example (231) is formed. If several client replicas occur in a row, the one with the highest predicted probability is chosen as the interrogative one. If several operator reply replicas occur in a row in response to the client's request, a training example is formed with each of them.

[0057] In a positive training example (231), all replicas up to and including the client's request are used as the context. The operator replica that follows is used as the response. The context includes the last n replicas (of both the client and the operator) preceding the client's request, where n is a model parameter, for example from 1 to 6. In an exemplary implementation, processing the dataset (210) with the model (220) produced a training set containing about 1,000,000 positive examples (231).
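The rules for forming positive examples can be sketched as follows. This is a minimal illustration, not the implementation from the description: the function name, the `(speaker, text)` dialogue format, the `question_prob` callable standing in for the model (220), and the choice to count the request itself among the n context replicas are all assumptions.

```python
def build_positive_examples(dialogue, question_prob, n_context=3):
    """Form (context, reply, 1) examples from one dialogue.

    dialogue: list of (speaker, text), speaker is "client" or "operator".
    question_prob: hypothetical callable returning the probability
    that a text is an interrogative sentence.
    """
    examples = []
    i = 0
    while i < len(dialogue):
        if dialogue[i][0] != "client":
            i += 1
            continue
        j = i  # end of the run of consecutive client replicas
        while j < len(dialogue) and dialogue[j][0] == "client":
            j += 1
        k = j  # end of the run of operator replies that follows
        while k < len(dialogue) and dialogue[k][0] == "operator":
            k += 1
        if k > j:
            # Among consecutive client replicas, take the most question-like.
            q = max(range(i, j), key=lambda idx: question_prob(dialogue[idx][1]))
            start = max(0, q + 1 - n_context)
            context = [text for _, text in dialogue[start:q + 1]]
            # One example per operator reply in the run.
            for _, reply in dialogue[j:k]:
                examples.append((context, reply, 1))
        i = k
    return examples
```

A client run that is not followed by any operator reply yields no example, since there is no response to pair it with.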

[0058] Negative examples (232) were formed by replacing the correct response in a positive example with an arbitrary one drawn from the set of all possible operator responses (of which there were about 1,000,000 at the time of training).
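A minimal sketch of this negative sampling is given below. The `(context, reply, label)` tuple layout and the function name are illustrative assumptions. Redrawing when the random reply happens to equal the correct one is also an assumption; the description only says the replacement is arbitrary.

```python
import random

def make_negative_examples(positives, all_operator_replies, rng=None):
    """For each positive (context, reply, 1), emit (context, random_reply, 0)."""
    rng = rng or random.Random()
    can_redraw = len(set(all_operator_replies)) > 1
    negatives = []
    for context, true_reply, _label in positives:
        wrong = rng.choice(all_operator_replies)
        # Assumption: avoid accidentally sampling the correct reply.
        while can_redraw and wrong == true_reply:
            wrong = rng.choice(all_operator_replies)
        negatives.append((context, wrong, 0))
    return negatives
```

With a pool of about a million replies the redraw is rarely triggered, but it keeps small test sets free of mislabeled negatives.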

[0059] Based on the generated training sample, the model for determining relevant replicas (240) is trained at step (104). Fig. 4 shows an example of training the model for determining relevant replicas (240). The input of the model (240) receives the data of the training sample (230), formed from the obtained positive (231) and negative (232) examples of processing client requests.

[0060] Training batches are formed from the training part of the training sample (230). The ratio of positive to negative examples in a batch is chosen to be approximately equal. Typical batch sizes are 256 and 512. The model (240) is trained for 32 epochs and is validated on a held-out sample at the end of each epoch. The held-out sample is a part of the dataset that is not used to train the model but is used to validate it (to compute metrics). For example, the held-out sample may make up 10% of the originally generated training sample.
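The held-out split and the roughly balanced batches can be sketched as below. The function names, the fixed seeds, and the exact half-and-half composition of each batch are assumptions for illustration; the description only states that the ratio is approximately equal and that about 10% may be held out.

```python
import random

def split_holdout(examples, holdout_fraction=0.1, seed=0):
    """Shuffle and split into (train, held_out)."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def balanced_batches(positives, negatives, batch_size=256, seed=0):
    """Yield batches with batch_size // 2 positives and as many negatives."""
    rng = random.Random(seed)
    half = batch_size // 2
    pos, neg = positives[:], negatives[:]
    rng.shuffle(pos)
    rng.shuffle(neg)
    n_batches = min(len(pos), len(neg)) // half
    for b in range(n_batches):
        batch = pos[b * half:(b + 1) * half] + neg[b * half:(b + 1) * half]
        rng.shuffle(batch)  # mix labels inside the batch
        yield batch
```

Examples that do not fill a complete batch are dropped in this sketch; a real training loop might keep them instead.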

[0061] Optionally, during validation, a manually labeled dataset of questions and answers can be used in addition to the training sample (230) obtained automatically from raw (that is, unlabeled) data. If the manually labeled dataset is large enough (thousands of question-answer pairs), the model (240) can additionally be fine-tuned on these pairs. In that case the training sample (230) is replaced with the manually labeled dataset, on which training of the model (240) continues. This leads to a substantial increase in quality metrics on questions from the additional dataset.

[0062] If there is little such data, the model (240) is validated on it by computing the corresponding quality metrics. As a rule, the recall@k and precision@k metrics are computed for a question-answering system that includes the model (240); the model with the highest values of these metrics can be selected for subsequent serialization with pickle (the pickle module implements serialization and deserialization of Python objects). The value of recall@k is determined by how often the correct answer to a question falls within the top-k answers of the model ranked by relevance. It is computed by the formula: (number of relevant answers up to the k-th position in the ranked list of answers) / (total number of relevant answers).

For example: the model was asked 10 questions; 5 times the correct answer was first in the sorted list of answers, and 8 times the correct answer was among the top 3 answers sorted by relevance. For this test, recall@1 = 5/10 = 0.5 and recall@3 = 8/10 = 0.8 (assuming there is exactly one relevant answer to each question).
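The worked example can be reproduced with a short sketch. It assumes, as the example does, a single relevant answer per question; the function name and input layout are illustrative.

```python
def recall_at_k(ranked_answers, correct_answers, k):
    """Share of questions whose single correct answer is in the top k.

    ranked_answers: one list per question, sorted by predicted relevance.
    correct_answers: the one relevant answer for each question.
    """
    hits = sum(
        1
        for ranked, correct in zip(ranked_answers, correct_answers)
        if correct in ranked[:k]
    )
    return hits / len(correct_answers)
```

With several relevant answers per question, the numerator and denominator would have to count relevant answers rather than questions, as in the general formula.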

[0063] Training the model (240) takes 2-3 hours on average when an NVIDIA 1080Ti GPU is used. The model (240) with the highest accuracy on the held-out sample is serialized into a binary format (pickle) for further use.

[0064] Fig. 5 shows the architecture of the model (240) for determining relevant replicas. The relevance model (240) is designed to assess the relevance of a given context-response pair. The model has two input nodes: one for the dialogue context (2401) and one for the candidate replica (2402). The model has one output node (2407), a module that determines the relevance score, which takes values from 0 to 1.

[0065] The word vectorization module (2403) is functionally similar to the module (2202), which also performs word-level vectorization. The label "Iterative" means that the module (2404) can perform the prescribed data processing several times in sequence. Here, the "Iterative" parameter of the word vectorization module (2404) indicates that the word vectorization module (240) is applied in turn to each replica of the context (there are 3 of them in this example). This is not required for the candidate replica, because it consists of a single sentence (so vectorization runs once).

[0066] The module (2405) vectorizes sentences and replicates the functionality of the module (2203). In the diagram shown in Fig. 5, the subsampling and concatenation nodes are encapsulated inside the module (2405) and are not shown explicitly. Here, the "Iterative" parameter indicates that the sentence vectorization module (2406) is applied in turn to each replica of the context (there are 3 of them in this example).

[0067] The module (2407) is the relevance computation module. It takes the vector representations of the context and of the candidate as input and returns a single number in [0, 1], the relevance score. The score is computed from a set of factors that include: the dot product between the candidate vector and each of the context vectors; the dot product between the candidate vector and the sum of the context vectors; and factors obtained by concatenating the context vectors with the candidate vector and then computing a dot product with the candidate vector.

[0068] The result of concatenating the context vectors and the candidate vector is fed to a two-layer perceptron. The dimension of its output layer is LSTM_DIM, so the output is a matrix of shape (CTX_LEN, LSTM_DIM). Its dot product with the candidate vector is computed, which yields CTX_LEN factors for the given context. The context length is denoted CTX_LEN and can range from 1 (the context is only the question) to unbounded (the context is the entire dialogue). Typical values are 1 to 5. As an example implementation, CTX_LEN = 3 gives 7 factors for computing relevance.
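The count of 7 factors for CTX_LEN = 3 can be checked with a small sketch: CTX_LEN dot products with the individual context vectors, one with their sum, and CTX_LEN from the perceptron branch. The reading that each context vector is concatenated with the candidate separately is an assumption, and `transform` is a hypothetical stand-in for the trained two-layer perceptron (it must map a vector of length 2 * LSTM_DIM to one of length LSTM_DIM).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relevance_factors(context_vecs, candidate_vec, transform):
    """Return the 2 * CTX_LEN + 1 factors fed to the final scorer.

    context_vecs: CTX_LEN lists of length LSTM_DIM.
    candidate_vec: list of length LSTM_DIM.
    transform: hypothetical placeholder for the two-layer perceptron.
    """
    # One factor per context replica.
    per_replica = [dot(c, candidate_vec) for c in context_vecs]
    # One factor for the element-wise sum of the context vectors.
    summed = [sum(col) for col in zip(*context_vecs)]
    with_sum = [dot(summed, candidate_vec)]
    # CTX_LEN factors from the concatenation branch.
    transformed = [transform(c + candidate_vec) for c in context_vecs]
    learned = [dot(t, candidate_vec) for t in transformed]
    return per_replica + with_sum + learned
```

For three context vectors this returns 3 + 1 + 3 = 7 numbers, matching the figure given for CTX_LEN = 3.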

[0069] These factors are fed to another two-layer perceptron with a sigmoid activation function on the last layer. The output of the module (2406) is a single number, the relevance score, which is the final output of the whole model (240). The model (240) is trained as a binary classifier that predicts whether a given candidate is relevant to a given context; in other words, the model (240) makes it possible to determine the answer to the question received in a request.

[0070] The trained model (240) can be used to build a question-answering system as follows. A limited set of candidates is selected from all possible operator responses. During selection, responses that are too short, too long, uninformative, or duplicated are excluded from the candidates. This process can be either fully automated or semi-automated; in the latter case the final list of candidates is additionally checked by hand by a specialist, which further improves the quality of the whole system. The result is a set of candidates (usually hundreds to thousands), each of which the model (240) can use to answer a request.

[0071] To answer a request, the model (240) evaluates the relevance of the request context to each of the loaded candidates. The top-k candidates (typically k = 3) are returned as the most likely answers to the request in the client's message.
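Ranking the candidates and keeping the top k can be sketched as below. `score_fn` is a hypothetical callable standing in for the trained model (240); it takes a context and a candidate and returns a relevance score.

```python
import heapq

def top_k_candidates(context, candidates, score_fn, k=3):
    """Return the k candidates with the highest relevance score, best first."""
    return heapq.nlargest(k, candidates, key=lambda cand: score_fn(context, cand))
```

In this sketch every candidate is scored once per request, so the cost grows linearly with the size of the candidate set (hundreds to thousands in the description).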

[0072] Once the trained model (240) for determining the relevance of replicas has been obtained, it can be used in automated systems that analyze dialogues coming from clients. Such systems include, for example, chatbots, intelligent assistants hosted on websites, widgets, telephone robots, and the like.

[0073] Fig. 6 shows an example of applying the trained model (240) for determining relevant replicas to process requests from a client (10) that arrive at a resource (20). The resource (20) can be a web resource (website, portal, etc.), a call center, a mobile application, and the like. The client (10) can submit a request as a phone call, through a chat session, a VoIP call, a specialized widget or software, and so on. On receiving the request information from the client (10), the resource (20) passes the context of the request to the trained model (240), which determines the question-answer pair and passes the data for generating an answer to the client's (10) question to the module for forming a response to the client's request (250). The response to the client's (10) request is sent from the module (250), as a rule, over the same information channel through which the request arrived. The response can be a reply from a chatbot or a telephone robot, interactive information, a hyperlink, a combination of response options, and so on.

[0074] Fig. 7 shows an example of the general form of a computing system (300) that implements the claimed method (100) or is part of a computer system, for example a server, a personal computer, or a part of a computing cluster that processes the data needed to implement the claimed technical solution.

[0075] In the general case, the system (300) contains the following components connected by a common data exchange bus: one or more processors (301), memory means such as RAM (302) and ROM (303), input/output interfaces (304), input/output devices (305), and a networking device (306).

[0076] The processor (301) (or several processors, a multi-core processor, etc.) can be chosen from the range of devices in wide use today, for example from manufacturers such as Intel™, AMD™, Apple™, Samsung Exynos™, MediaTEK™, Qualcomm Snapdragon™, and the like. A graphics processor, for example an NVIDIA GPU or Graphcore, should also be counted as the processor or one of the processors used in the system (300); this type of processor is likewise suitable for full or partial execution of the method (100) and can also be used to train and apply machine learning models in various information systems.

[0077] The RAM (302) is random access memory intended for storing machine-readable instructions executed by the processor (301) to perform the necessary logical data processing operations. The RAM (302) typically holds the executable instructions of the operating system and of the associated software components (applications, software modules, etc.). The available memory of a graphics card or graphics processor can also serve as the RAM (302).

[0078] ROM (303) is one or more persistent data storage devices, for example a hard disk drive (HDD), a solid-state drive (SSD), flash memory (EEPROM, NAND, etc.), optical storage media (CD-R/RW, DVD-R/RW, Blu-ray Disc, MD), etc.

[0079] Various types of I/O interfaces (304) are used to organize the operation of the components of the system (300) and of externally connected devices. The choice of interfaces depends on the specific implementation of the computing device; they may be, without limitation: PCI, AGP, PS/2, IrDA, FireWire, LPT, COM, SATA, IDE, Lightning, USB (2.0, 3.0, 3.1, micro, mini, type C), TRS/Audio jack (2.5, 3.5, 6.35), HDMI, DVI, VGA, DisplayPort, RJ45, RS232, etc.

[0080] To enable user interaction with the computing system (300), various information I/O means (305) are used, for example a keyboard, display (monitor), touch display, touchpad, joystick, mouse, light pen, stylus, touch panel, trackball, speakers, microphone, augmented reality devices, optical sensors, tablet, indicator lights, projector, camera, biometric identification means (retina scanner, fingerprint scanner, voice recognition module), etc.

[0081] The network communication means (306) provides data transmission over an internal or external computer network, for example an intranet, the Internet, a LAN, etc. One or more of the means (306) may be, without limitation: an Ethernet card, GSM modem, GPRS modem, LTE modem, 5G modem, satellite communication module, NFC module, Bluetooth and/or BLE module, Wi-Fi module, etc.

[0082] In addition, satellite navigation means, for example GPS, GLONASS, BeiDou, or Galileo, can also be used as part of the system (300).

[0083] As shown in Fig. 6, the resource (20) accessed by the client (10) can be implemented using the system (300), which can be a server providing the required functionality for processing incoming requests, recognizing response replicas using the trained model (240), and generating response messages using the module (250), which are transmitted over various wired and/or wireless information channels.
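The server-side flow described above (receive a request, score candidate response replicas, return the best one) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `embed`, `score`, `select_replica`, and `handle_request` are hypothetical, and the bag-of-words cosine similarity is only a toy stand-in for the trained model (240), whose actual architecture is not reproduced here.

```python
import math
from collections import Counter


def embed(text):
    """Toy encoder (assumption): bag-of-words token counts, not the patent's encoder."""
    return Counter(text.lower().split())


def score(context, replica):
    """Cosine similarity between request context and a candidate replica.

    Placeholder for the relevance score produced by the trained model (240).
    """
    a, b = embed(context), embed(replica)
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_replica(context, candidates):
    """Return the candidate replica with the highest score for this context."""
    return max(candidates, key=lambda replica: score(context, replica))


def handle_request(request, candidates):
    """Hypothetical counterpart of module (250): wrap the best replica in a response."""
    return {"request": request, "response": select_replica(request, candidates)}


if __name__ == "__main__":
    candidates = [
        "You can reset your card PIN in the mobile app",
        "Our office opens at nine",
    ]
    print(handle_request("how do I reset my card PIN", candidates)["response"])
```

In a real deployment the scoring function would be replaced by the trained neural model, while the surrounding selection and response-building steps could stay structurally similar.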

[0084] Requests from clients (10) can also be generated by a device that contains part of the functionality of the system (300). In particular, the client (10) device can be a smartphone, computer, tablet, terminal, or any other device that provides a communication channel with the resource (20) for generating and transmitting a request and receiving the required response, which may also include various types of digital information.

[0085] Thus, applying the model for determining relevant responses (240), created using the claimed method (100), achieves more accurate automated selection of response pairs based on the incoming context of user requests, which makes it possible to create a new, improved method for training and applying machine learning models in AI-based systems.

[0086] The application materials presented disclose preferred embodiments of the technical solution and should not be construed as limiting other, particular embodiments that do not go beyond the scope of the legal protection sought and that are obvious to specialists in the relevant field of technology.

Claims

1. A computer-implemented method for creating an artificial-intelligence-based dialogue analysis model for processing user requests, performed using at least one processor and comprising the steps of:

• obtaining a set of primary data, the set including at least text data of dialogues between users and operators containing user requests and operator responses;

• processing the obtained data set, during which a training sample for an artificial neural network is formed, containing positive and negative examples of user requests based on analysis of the dialogue context, the positive examples containing a semantically related set of operator replicas in response to the user request;

• selecting and encoding vector representations of each replica from the positive and negative examples of the training sample mentioned in the previous step;

• using the generated training sample to train a model for determining relevant replicas from the context of user requests in dialogues.

2. The method according to claim 1, characterized in that the model is at least one artificial neural network.

3. The method according to claim 1, characterized in that positive examples are formed on the basis of complete chains of dialogues between the operator and the user, and such a chain contains at least one interrogative sentence.

4. The method according to claim 1, characterized in that, when relevant replicas are selected for responding to a user's request phrase at the model training stage, a score is calculated for each response replica.

5. The method according to claim 1, characterized in that, at the stage of encoding replicas into a vector representation, replicas that are sentences are encoded as a matrix of semantic vectors.

6. A system for processing user requests in an information channel using artificial intelligence, comprising:

- at least one processor;

- at least one memory coupled to the processor, which contains machine-readable instructions that, when executed by at least one processor, provide:

- receiving user requests using the information channel;

- processing of user requests using a machine learning model for automated processing of user requests, created using the method according to any one of claims 1-5;

- generating a response message to the user's request and transmitting it over the information channel.

7. The system according to claim 6, characterized in that it is a server, mainframe or supercomputer.

8. The system according to claim 6, characterized in that the information channel is a chat session, VoIP communication or a telephone channel.

9. The system according to claim 6, characterized in that the chat session is a chat using a mobile application or a chat on a website.