WO2023278935A1 - Artificial intelligence based hotel demand model - Google Patents

Info

Publication number
WO2023278935A1
Authority
WO
WIPO (PCT)
Prior art keywords
hotel
room
models
mnl
customer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2022/072854
Other languages
French (fr)
Inventor
Sanghoon Cho
Andrew Vakhutinsky
Alan Wood
Jorge Luis Rivero Perez
Jean-Philippe Dumont
John Thomas Coulthurst
Denysse Diaz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/399,342 (US12014310B2)
Application filed by Oracle International Corp
Priority to CN202280043130.6A (CN117501286A)
Priority to CA3222594A (CA3222594A1)
Priority to JP2023577727A (JP2024523377A)
Publication of WO2023278935A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/12 Hotels or restaurants
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/02 Reservations, e.g. for tickets, services or events
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management

Definitions

  • Embodiments generate a demand model for a potential hotel customer of a hotel room.
  • Based on features of the potential hotel customer, embodiments form a plurality of clusters, each cluster including a corresponding weight and cluster probabilities.
  • Embodiments generate an initial estimated mixture of multinomial logit (“MNL”) models corresponding to each of the plurality of clusters, the mixture of MNL models including a weighted likelihood function based on the features and the weights.
  • Embodiments estimate an updated estimated mixture of MNL models and maximize the weighted likelihood function based on the revised cluster probabilities and updated weights.
  • Based on the updated weights and the updated estimated mixture of MNL models, embodiments generate the demand model that is adapted to predict a choice probability of room categories and rate code combinations for the potential hotel customer.
  • Fig.1 is an overview block diagram of a hotel reservation system in accordance to embodiments of the invention.
  • Fig.2 is a block diagram of a computer server/system in accordance with an embodiment of the present invention.
  • Fig.3 is a flow diagram of the functionality of the room demand model module of Fig.2 for generating a room demand model in accordance with one embodiment.
  • Fig.4 illustrates an example of the initial clustering in accordance with embodiments.
  • Fig.5 is an example illustrating various offered prices, room categories and rate codes.
  • Fig.6 illustrates the choice modeling for guest clusters in accordance to example embodiments.
  • Fig.7 illustrates the initial assignment of an MNL model to each cluster in accordance to embodiments.
  • Fig.8 illustrates the proposed likelihood function used with the EM functionality in accordance to embodiments.
  • Fig.9 illustrates a portion of the EM functionality in accordance to embodiments.
  • Fig.10 illustrates a portion of the EM functionality in accordance to embodiments.
  • Figs.11-16 illustrate an example of embodiments of the invention for three clusters.
  • Fig.17 illustrates a comparison in prediction accuracy over iterations between CCR and MSE in accordance to embodiments of the invention.
  • Fig.18 illustrates how the cluster characteristics are changed over iterations given two clusters in accordance to embodiments.
  • Embodiments predict the choice of the hotel room category and associated service type by customers based on estimating parameters of discrete-choice models built on dynamically determined clusters of observations. Each observation corresponds to the choice made by a customer booking a room in the hotel and selecting an associated type of service from an ordered set of room categories and service type pairs offered at certain prices. Each room category and type of service is described by a set of features determining the customer’s value, or utility, of the choice. In addition, each customer is characterized by a set of their own attributes determining the cluster to which the customer belongs, also known as the “persona type.” It is assumed that each persona type may have its own utility of the booking choice.
  • The choice probability is modeled as a multinomial logit function based on the room-service pair utility for each persona type.
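  • As a minimal sketch of the multinomial logit form, the choice probability of each room-service pair can be computed as the exponentiated utility divided by the sum of exponentiated utilities over all offered pairs. The function name and the utility values below are hypothetical and are not taken from the patent.

```python
import math


def mnl_choice_probabilities(utilities):
    """Return multinomial logit choice probabilities for one customer.

    `utilities` holds one utility value per offered room-service pair
    (hypothetical inputs; the patent does not give numeric utilities).
    """
    # Subtracting the maximum utility avoids overflow and leaves the
    # resulting probabilities unchanged.
    shift = max(utilities)
    exp_u = [math.exp(u - shift) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical utilities of three room-service pairs for one persona type.
probs = mnl_choice_probabilities([1.0, 0.5, -0.2])
```

  • Because only utility differences matter in this form, adding the same constant to every utility leaves the probabilities unchanged.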
  • Embodiments increase the accuracy of the prediction and build a basis for the prescriptive analytics application to optimize the personalized offer by maximizing the expected revenue.
  • Embodiments can be used as a standalone system or as the central part of the personalized price optimization system for the personalized hotel rooms and the display optimization system for the order of the room category and rate code.
  • Embodiments utilize iteratively reconfigurable dynamic clustering based on a semi-parametric mixture of discrete choice models to fully reflect customers' choice behavior, instead of the static clustering traditionally used for this purpose.
  • Embodiments address the need for a more accurate estimation of demand for hotel rooms by modeling the demand to account for heterogeneous customers with different: (1) Willingness-to-pay (indicated by the selected price range); (2) Rate plan selections (corporate discount, breakfast included, etc.); (3) Travel attributes; (4) Booking channels; (5) Booking windows; (6) Length of stay; (7) Date of arrival; and/or (8) Size of the group/family, number of children, etc.
  • The factors influencing the choice can include room features, rate plan features, price, and the order in which the offers are shown.
  • Fig.1 is an overview block diagram of a hotel reservation system 100 in accordance to embodiments of the invention.
  • Fig.1 includes booking channels 102 that a potential hotel customer may interact with to reserve a hotel room.
  • The channels include a Global Distribution System (“GDS”) 111, including “Amadeus”, “Sabre”, “Travel Port”, etc., Online Travel Agencies (“OTA”) 112, including “Booking.com”, “Expedia”, etc., Metasearch sites 113, and any other means for a customer to reserve a hotel room, including a website maintained by a hotel chain or individual hotel.
  • Each hotel chain operations 104 is accessed by an Application Programming Interface (“API”) 140 as a Web Service such as a “WebLogic Server” from Oracle Corp.
  • Hotel chain operations 104 includes a Hotel Property Management System (“PMS”) 121, such as “OPERA Cloud Property Management” from Oracle Corp., a Hotel Central Reservation System (“CRS”) 122, and a Demand Modeling module 150 that interfaces with systems 121 and 122 to provide optimized demand modeling as disclosed herein.
  • A hotel customer or potential hotel customer that uses system 100 to obtain a hotel room typically engages in a three-stage booking process. First, an area availability search is conducted. Multiple hotel chains are shown, and hotel CRS 122 provides static data. The static data can include the min/max rate, available dates, etc.
  • Once the booking customer selects a hotel, they go to the next step, which is the property search, covering a single hotel property with multiple rooms and rate plans.
  • Fig.2 is a block diagram of a computer server/system 10 in accordance with an embodiment of the present invention. Although shown as a single system, the functionality of system 10 can be implemented as a distributed system. Further, the functionality disclosed herein can be implemented on separate servers or devices that may be coupled together over a network. Further, one or more components of system 10 may not be included.
  • When implemented as a web server or cloud-based functionality, system 10 is implemented as one or more servers, and user interfaces such as displays, mouse, etc. are not needed. In embodiments, system 10 can be used to implement any of the elements shown in Fig.1.
  • System 10 includes a bus 12 or other communication mechanism for communicating information, and a processor 22 coupled to bus 12 for processing information. Processor 22 may be any type of general or specific purpose processor.
  • System 10 further includes a memory 14 for storing information and instructions to be executed by processor 22.
  • Memory 14 can be comprised of any combination of random access memory (“RAM”), read only memory (“ROM”), static storage such as a magnetic or optical disk, or any other type of computer readable media.
  • System 10 further includes a communication device 20, such as a network interface card, to provide access to a network. Therefore, a user may interface with system 10 directly, or remotely through a network, or any other method.
  • Computer readable media may be any available media that can be accessed by processor 22 and includes both volatile and nonvolatile media, removable and non-removable media, and communication media. Communication media may include computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and includes any information delivery media.
  • Processor 22 is further coupled via bus 12 to a display 24, such as a Liquid Crystal Display (“LCD”).
  • Memory 14 stores software modules that provide functionality when executed by processor 22.
  • The modules include an operating system 15 that provides operating system functionality for system 10.
  • The modules further include room demand model module 16 that generates a room demand model to maximize the expected hotel room revenue (i.e., the product of the room booking probability and room price), and all other functionality disclosed herein.
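  • The expected revenue described above can be illustrated with a short sketch. Summing the probability-price products over several offered rooms is an assumption made here for illustration, and the function name, probabilities, and prices are hypothetical.

```python
def expected_revenue(booking_probabilities, prices):
    """Sum of booking probability times price over the offered rooms.

    Both arguments are hypothetical inputs: one booking probability and
    one price per offered room.
    """
    if len(booking_probabilities) != len(prices):
        raise ValueError("need exactly one price per booking probability")
    return sum(p * price for p, price in zip(booking_probabilities, prices))


# Two offered rooms; the remaining probability mass (0.25) is "no booking".
revenue = expected_revenue([0.5, 0.25], [200.0, 400.0])
```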
  • System 10 can be part of a larger system.
  • System 10 can include one or more additional functional modules 18 to include the additional functionality, such as the functionality of a Property Management System (“PMS”) (e.g., the “Oracle Hospitality OPERA Property” or the “Oracle Hospitality OPERA Cloud Services”) or an enterprise resource planning (“ERP”) system.
  • A database 17 is coupled to bus 12 to provide centralized storage for modules 16 and 18 and store guest data, hotel data, transactional data, etc.
  • In embodiments, database 17 is a relational database management system (“RDBMS”) that can use Structured Query Language (“SQL”) to manage the stored data.
  • In embodiments, database 17 is implemented as an in-memory database (“IMDB”).
  • An IMDB is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism. Main memory databases are faster than disk-optimized databases because disk access is slower than memory access and because the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory eliminates seek time when querying the data, which provides faster and more predictable performance than disk.
  • In embodiments, database 17, when implemented as an IMDB, is implemented based on a distributed data grid.
  • A distributed data grid is a system in which a collection of computer servers work together in one or more clusters to manage information and related operations, such as computations, within a distributed or clustered environment.
  • A distributed data grid can be used to manage application objects and data that are shared across the servers.
  • A distributed data grid provides low response time, high throughput, predictable scalability, continuous availability, and information reliability.
  • Distributed data grids such as, e.g., the “Oracle Coherence” data grid from Oracle Corp., store information in-memory to achieve higher performance, and employ redundancy in keeping copies of that information synchronized across multiple servers, thus ensuring resiliency of the system and continued availability of the data in the event of failure of a server.
  • System 10 is a computing/data processing system including an application or collection of distributed applications for enterprise organizations, and may also implement logistics, manufacturing, and inventory management functionality.
  • The applications and computing system 10 may be configured to operate with or be implemented as a cloud-based networking system, a software-as-a-service (“SaaS”) architecture, or other type of computing solution.
  • Embodiments solve the problem of predicting demand for multiple hotel room categories and service type combinations based on the hotel customer attributes, room category and service type features, offered price, and the order in which the room-rate pairs are presented to the customer.
  • Embodiments utilize a dynamic clustering approach to enable high-accuracy prediction of the room-service combination by a booking customer.
  • Embodiments start with an initial clustering to divide customers into several clusters so that the characteristics of customers within each cluster are more homogeneous than those across clusters, and assume a personalized choice model within each cluster. Since the cluster membership of customers (i.e., which cluster each customer belongs to) is unobservable, embodiments employ a soft clustering approach, in which the "mix" is captured through a customer's probability of belonging to each cluster. To do so, embodiments implement unsupervised clustering using a random forest clustering algorithm with a certain number of clusters based on the characteristics of the potential hotel customers, orders of the room-service pairs, and their features, including the price offered.
  • Embodiments derive a weighted likelihood function from the observed customers based on the discrete choice multinomial logit (“MNL”) models corresponding to the clusters, with the weights set to the cluster probabilities obtained from the initial clustering. Then, embodiments maximize the weighted likelihood function to obtain the values of the coefficients of each covariate and the intercept in the MNL models. Choice probabilities for multiple hotel room category and service type combinations for each customer are calculated from those values. The number of clusters is set to the value that delivers the best prediction accuracy. In embodiments, the initial clustering is based on the customer features, not their choices.
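  • The weighted likelihood maximization for a single cluster's MNL model can be sketched as follows. This is an illustrative sketch only: the data layout, the function names, the plain gradient ascent optimizer with a fixed step size, and the toy data are all assumptions, and intercepts are omitted for brevity. The patent does not specify an optimizer.

```python
import math


def choice_probs(beta, alternatives):
    """MNL probabilities over alternatives, each given as a feature list."""
    utilities = [sum(b * x for b, x in zip(beta, feats)) for feats in alternatives]
    shift = max(utilities)  # numerical stability
    exp_u = [math.exp(u - shift) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def weighted_log_likelihood(beta, observations, weights):
    """Sum over customers of weight times log-probability of the observed choice."""
    return sum(
        w * math.log(choice_probs(beta, alts)[chosen])
        for (alts, chosen), w in zip(observations, weights)
    )


def fit_weighted_mnl(observations, weights, n_features, step=0.5, n_iter=500):
    """Maximize the weighted log-likelihood by plain gradient ascent."""
    beta = [0.0] * n_features
    total_w = sum(weights)
    for _ in range(n_iter):
        grad = [0.0] * n_features
        for (alts, chosen), w in zip(observations, weights):
            probs = choice_probs(beta, alts)
            for k in range(n_features):
                # Observed feature value minus its expectation under the model.
                expected = sum(p * feats[k] for p, feats in zip(probs, alts))
                grad[k] += w * (alts[chosen][k] - expected)
        beta = [b + step * g / total_w for b, g in zip(beta, grad)]
    return beta


# Toy data (hypothetical): two offers described by one feature, the price in
# hundreds. Each observation is (alternatives, index of the chosen offer).
observations = [([[1.0], [2.0]], 0), ([[1.0], [2.0]], 1)]
weights = [0.75, 0.25]  # cluster probabilities used as observation weights
beta = fit_weighted_mnl(observations, weights, n_features=1)
# For this toy data the coefficient approaches -ln(3), about -1.0986.
```

  • In this sketch the weights play the role of the cluster probabilities: an observation with a small probability of belonging to the cluster contributes proportionally less to that cluster's fitted coefficients.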
  • Embodiments update the weights as the initial clustering probabilities multiplied by the choice probabilities calculated at the previous step, which can be viewed as the E-step of the Expectation-Maximization (“EM”) algorithm. Then, embodiments re-fit the models with the newly formed clusters, performing the dynamic clustering step by maximizing the updated weighted likelihood function, which constitutes the M-step of the EM algorithm. Finally, embodiments repeat the E-step and M-step until the convergence criterion is met. After convergence, embodiments obtain the final estimates of the model parameters.
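  • The weight update described as the E-step can be sketched as follows. Normalizing the products so that each customer's weights sum to one is the usual EM convention and is an assumption here; the function name and the numbers are hypothetical.

```python
def e_step(initial_cluster_probs, chosen_probs):
    """Update per-customer cluster weights.

    initial_cluster_probs[i][g]: probability from the initial clustering that
        customer i belongs to cluster g.
    chosen_probs[i][g]: probability, under cluster g's current MNL model, of
        the choice customer i actually made.
    Returns weights proportional to the product of the two, normalized so
    that each customer's weights sum to one (assumed convention).
    """
    updated = []
    for priors, likelihoods in zip(initial_cluster_probs, chosen_probs):
        joint = [p * l for p, l in zip(priors, likelihoods)]
        total = sum(joint)
        updated.append([j / total for j in joint])
    return updated


# Two customers, two clusters (hypothetical numbers).
weights = e_step(
    initial_cluster_probs=[[0.5, 0.5], [0.9, 0.1]],
    chosen_probs=[[0.8, 0.2], [0.1, 0.9]],
)
```

  • For the second customer, a strong initial association with the first cluster is offset by that cluster's model assigning a low probability to the observed choice, so the updated weights are balanced between the two clusters.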
  • Embodiments can predict the choice probabilities for a new customer after estimating their association with each cluster by solving a classification problem employing the supervised Random Forest classifier.
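  • One way to combine the two pieces for a new customer is to mix the per-cluster choice probabilities using the estimated cluster membership probabilities. The sketch below assumes this mixture form; the membership probabilities would come from a classifier such as the one mentioned above, but are hard-coded, hypothetical values here.

```python
def mixture_choice_probabilities(cluster_probs, per_cluster_choice_probs):
    """Mix per-cluster choice probabilities by cluster membership probability.

    cluster_probs[g]: estimated probability that the new customer belongs to
        cluster g (hypothetical classifier output).
    per_cluster_choice_probs[g][j]: probability of offer j under cluster g's
        MNL model.
    """
    n_offers = len(per_cluster_choice_probs[0])
    return [
        sum(pi * probs[j] for pi, probs in zip(cluster_probs, per_cluster_choice_probs))
        for j in range(n_offers)
    ]


# Two clusters, two offers (hypothetical numbers).
prediction = mixture_choice_probabilities(
    cluster_probs=[0.25, 0.75],
    per_cluster_choice_probs=[[0.8, 0.2], [0.4, 0.6]],
)
```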
  • Dynamic Iteratively Reconfigurable Clustering Algorithm/Functionality: Embodiments implement a dynamic iteratively reconfigurable clustering algorithm/functionality for predicting demand in order to generate a hotel room demand model. Assume that a customer population of interest consists of G clusters, where G > 1, such that the pattern of booking a room is relatively homogeneous across customers within each cluster, while there is heterogeneity in booking patterns across clusters.
  • The MNL model is also known as the logit model.
  • The vector specifies how customer features affect clustering, i.e., which cluster the customer belongs to.
  • The cluster membership indicator $l_i$ is unobservable; thus, the true structure of the mixing distribution is not known in practice, and it is hard to test whether the specified model is correct.
  • A pre-specified parametric family for the mixing distribution as in equation (1) may not be consistent with the true mixing distribution. This is referred to as the model misspecification problem, and it leads to biased parameter estimates or low goodness-of-fit measures, which affect prediction accuracy.
  • Embodiments implement a semiparametric mixture of discrete choice models by assuming equations (1) and (2) rather than equations (1) and the MNL model (3).
  • Embodiments use the same idea as the EM algorithm, as follows. Suppose that the latent cluster membership indicator $l_i$ is known.
  • The complete likelihood function and the complete log-likelihood function are then defined in terms of $l_i$ [equations not recoverable from the extracted text].
  • The maximizer of the objective function of equation (4) can be found by using an iterative method [update equation not recoverable from the extracted text]. Specifically, embodiments repeat the following E-step and M-step.
  • E-step: Compute the conditional expectation given the observed data [equation not recoverable from the extracted text].
  • M-step: Update the parameter by solving an equation [not recoverable from the extracted text].
  • As disclosed, embodiments employ an unsupervised clustering technique, such as random forest, based on customer features.
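  • For orientation, a textbook EM formulation for a mixture of MNL models is sketched below. It is not the patent's own notation, which did not survive extraction, and every symbol is an assumption: $\pi_g(x_i)$ is the probability that customer $i$ with features $x_i$ belongs to cluster $g$, $P_g(y_i \mid z_i; \beta_g)$ is the MNL probability of the observed choice $y_i$ given offer features $z_i$ and cluster parameters $\beta_g$, $\theta$ collects all $\beta_g$, and $w_{ig}^{(t)}$ is the weight of customer $i$ in cluster $g$ at iteration $t$.

```latex
\begin{align}
L_c(\theta) &= \prod_{i=1}^{n} \prod_{g=1}^{G}
  \bigl[\pi_g(x_i)\, P_g(y_i \mid z_i; \beta_g)\bigr]^{\mathbf{1}(l_i = g)} \\
\ell_c(\theta) &= \sum_{i=1}^{n} \sum_{g=1}^{G} \mathbf{1}(l_i = g)\,
  \bigl[\log \pi_g(x_i) + \log P_g(y_i \mid z_i; \beta_g)\bigr] \\
\text{E-step:}\quad w_{ig}^{(t)} &=
  \frac{\pi_g(x_i)\, P_g(y_i \mid z_i; \beta_g^{(t)})}
       {\sum_{h=1}^{G} \pi_h(x_i)\, P_h(y_i \mid z_i; \beta_h^{(t)})} \\
\text{M-step:}\quad \beta_g^{(t+1)} &=
  \arg\max_{\beta_g} \sum_{i=1}^{n} w_{ig}^{(t)} \log P_g(y_i \mid z_i; \beta_g)
\end{align}
```

  • In this formulation the E-step replaces the unobserved indicator $\mathbf{1}(l_i = g)$ with its conditional expectation $w_{ig}^{(t)}$, and the M-step maximizes the resulting weighted log-likelihood separately for each cluster.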
  • The EM algorithm can be adapted to this context, referred to as iterative reconfigurable clustering, as follows.
  • Fig.3 is a flow diagram of the functionality of room demand model module 16 of Fig.2 for generating a room demand model in accordance with one embodiment.
  • The functionality of the flow diagram of Fig.3 is implemented by software stored in memory or other computer readable or tangible medium, and executed by a processor.
  • The functionality may be performed by hardware (e.g., through the use of an application specific integrated circuit (“ASIC”), a programmable gate array (“PGA”), a field programmable gate array (“FPGA”), etc.), or any combination of hardware and software.
  • At 302, an initial unsupervised soft clustering is developed to cluster customers based on a plurality of attributes/characteristics assigned to each customer.
  • The attributes can include one or more of: (1) the global distribution system being used (e.g., Amadeus, SABRE, etc.); (2) the booking channel; (3) the number of nights; (4) the number of arriving customers; (5) booking advanced days; (6) weekend vs. weekday; (7) corporate booking.
  • The initial clustering at 302 is based on the customer features, not the customer choices.
  • Customer features are those features that are known at the time of the request for the room, and include such data as the arrival date and time, the number in the party, and the booking channel.
  • The customer feature data includes other inferred features such as the booking window (i.e., the time between the booking date and the arrival date).
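  • An inferred feature such as the booking window can be derived from the raw request data. A minimal sketch, assuming the window is measured in whole days; the function name and the dates are hypothetical.

```python
from datetime import date


def booking_window_days(booking_date, arrival_date):
    """Number of days between the booking date and the arrival date."""
    return (arrival_date - booking_date).days


# Hypothetical booking made two weeks before arrival.
window = booking_window_days(date(2024, 3, 1), date(2024, 3, 15))
```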
  • Both the initial clustering and the dynamic clustering described below, in which the initial and subsequent clusterings are dynamically updated, incorporate machine learning.
  • The initial clustering at 302 can incorporate any unsupervised machine learning techniques for clustering, such as random forest, or soft clustering algorithms using Gaussian mixture models.
  • The cluster membership is unobservable. It is therefore challenging to assume a pre-specified parametric model of how clusters are formed based on the customer characteristics, and hard to test whether the pre-specified parametric model is correct. Failure to specify a correct model leads to biased parameter estimates or low goodness-of-fit measures, which affect prediction accuracy.
  • Fig.4 illustrates an example of the initial clustering in accordance with embodiments. As shown in Fig.4, based on the guest characteristics, external factors and travel attributes, three clusters are formed. In embodiments, the number of clusters is a predefined parameter based on the interpretability of the clustering, which typically limits the number of clusters to single digits. In various embodiments, two to four clusters are used.
  • At 304, embodiments estimate an initial mixture of Multinomial Logit (“MNL”) models for the demand for the hotel room category and rate code combinations based on parameters related to the hotel room offerings, including: (1) Offered prices; (2) Room category and rate plan position in the offer; and (3) Room and rate features such as view, room size, whether breakfast is included, free cancellation, etc.
  • Fig.5 is an example illustrating various offered prices (e.g., $335), room categories (e.g., deluxe or superior, king or queen bed) and rate codes (e.g., “Breakfast Included Rate”).
  • Fig.6 illustrates the choice modeling for guest clusters in accordance to example embodiments. As shown in Fig.6, each cluster uses a unique discrete choice model to predict the choice of the hotel room and rate code combination for each customer. Fig.7 illustrates the initial assignment of an MNL model to each cluster in accordance to embodiments. 306, 308 and 310, collectively and on an iterative basis, form an Expectation Maximization (“EM”) functionality.
  • The EM functionality includes 306, 308, and 310, and also contains the soft clustering, which is updated in the E-step at 306.
  • Soft clustering at 302 is an initial clustering which is not repeated.
  • The cluster probabilities are updated by incorporating the choice probabilities of customers evaluated at the parameter values of the current iteration.
  • Fig.8 illustrates the proposed likelihood function used with the EM functionality in accordance to embodiments. As shown, the proposed likelihood function includes both the cluster model generated at 302 and shown in Fig.4, and the choice model generated at 304 and shown in Fig.6.
  • The proposed likelihood function is the objective function of the EM functionality.
  • Embodiments find a maximizer of this objective function to estimate the model parameters by using the EM functionality.
  • Fig.9 illustrates a portion of the EM functionality in accordance to embodiments.
  • The Expectation step (E-step) is determined at 306.
  • At 308, embodiments estimate an updated mixture of MNL models where the mixture probabilities are the updated cluster probabilities from the E-step.
  • At 310, 306 and 308 are repeated until the convergence criterion is met.
  • At 312, a demand model is generated that predicts the choice probabilities of room category and rate code combinations for a new customer.
  • The functionality then ends.
  • The functionality of Fig.3 combines discrete choice model estimation with data-driven identification of customer segments, captures the varying preferences of a heterogeneous customer population, and provides interpretable model outputs.
  • The demand model generated at 312 provides a practical approach that can help hoteliers profile their customers/guests based on their preferences, which can serve as a valuable input to: (1) formulate more efficient marketing policies and offer personalized recommendations that are more likely to be accepted; and (2) generate optimal personalized prices and display positions for each room type (e.g., suite with water view and queen bed).
  • Fig.10 illustrates a portion of the EM functionality in accordance to embodiments.
  • Fig.10 illustrates the M-step at 308 and the repeat until convergence at 310.
  • Figs.11-16 illustrate an example of embodiments of the invention for three clusters.
  • Fig.11 illustrates the soft clustering (302 of Fig.3) at 1101 and the choice modeling (304 of Fig.3) at 1102, where a different MNL model is generated for each of the clusters from the soft clustering.
  • The number of clusters is determined before the EM functionality is used. To choose the best number of clusters, prediction accuracy measures are compared across several different numbers of clusters, and the number achieving the most accurate prediction is selected.
  • Fig.12 illustrates the first iteration (306 and 308 of Fig.3) which uses the E-step to reassign the conditional cluster probability at 1201.
  • Fig.13 illustrates the first iteration (306 and 308 of Fig.3) which uses the M-step to update the choice model at 1301, which revises the conditional cluster probability.
  • Fig.14 illustrates the second iteration, which uses the E-step with the updated conditional cluster probability from 1301, and Fig.15 illustrates the second iteration using the M-step.
  • Fig.16 then illustrates the generation of the demand model using the estimated model parameters that forms a prediction of a choice probability of a new customer.
  • Metrics for Assessment: To investigate the performance of the iterative reconfigurable clustering in accordance with embodiments, embodiments divide the dataset into a training dataset and a test dataset. After estimating the model parameters as well as the initial clustering from the training data, embodiments obtain the predicted values of the product choices among the customers in the test data. For prediction accuracy measurements, embodiments use the correct classification ratio (“CCR”) and mean squared error (“MSE”). The CCR is calculated as the percentage of the observations where the option with the highest predicted probability coincides with the observed choice.
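  • The MSE is calculated as: $$\mathrm{MSE} = \frac{1}{n_{te}} \sum_{i \in \mathcal{I}_{te}} \sum_{j} \left(y_{ij} - \hat{y}_{ij}\right)^2$$ where $y_{ij}$ and $\hat{y}_{ij}$ are the true and predicted choices of customer $i$ for room type $j$, $\mathcal{I}_{te}$ is the index set of customers in the test data, and $n_{te}$ is the number of customers in the test data.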
  • This metric is also referred to as the Brier score, which is commonly used in evaluating probabilistic predictions.
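  • The two accuracy measures can be sketched as follows. The sketch assumes the MSE sums, for each test customer, the squared differences between the one-hot observed choice and the predicted probabilities, and then averages over test customers (the multi-class Brier score form). The CCR is returned as a fraction rather than a percentage. Function names and data are hypothetical.

```python
def correct_classification_ratio(predicted_probs, observed_choices):
    """Fraction of customers whose most probable option is the observed one."""
    hits = 0
    for probs, observed in zip(predicted_probs, observed_choices):
        best = max(range(len(probs)), key=lambda j: probs[j])
        if best == observed:
            hits += 1
    return hits / len(observed_choices)


def mean_squared_error(predicted_probs, observed_choices):
    """Average over customers of the summed squared error across options."""
    total = 0.0
    for probs, observed in zip(predicted_probs, observed_choices):
        for j, p in enumerate(probs):
            truth = 1.0 if j == observed else 0.0  # one-hot observed choice
            total += (truth - p) ** 2
    return total / len(observed_choices)


# Two test customers, three options each (hypothetical predictions).
predicted = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]
observed = [0, 2]
ccr = correct_classification_ratio(predicted, observed)  # 0.5
mse = mean_squared_error(predicted, observed)  # 0.625
```

  • A higher CCR and a lower MSE both indicate better predictions; the MSE additionally penalizes confident predictions that turn out to be wrong.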
  • Embodiments were applied to a real, proprietary dataset covering multiple hotels in multiple cities and countries. The data includes reservation information and the corresponding customer characteristics. Further, the log data (i.e., customers’ real-time requests for a reservation and the corresponding responses by the reservation server system) is included. From this log, information was extracted on the display order of rooms and rate codes. This order is set strategically by each hotelier, so display orders vary across customers. The final dataset included 9,173 reservations from July 2, 2019 to July 19, 2019 with 18 different rooms and 15 different rate codes.
  • Embodiments first find the optimal number of clusters in the customer population, which is usually unknown.
  • Embodiments employ the prediction criteria approach to select the optimal number of clusters.
  • Embodiments employ the MSE and choose the number of clusters that has the highest prediction accuracy among 2, 3, 4, and 5 clusters. Experiments determine that 2 clusters have the best performance among the 4 options.
  • Embodiments of the invention are implemented using the real hotel reservation dataset. Specifically, the prediction accuracy of embodiments is compared against a single cluster benchmark. Embodiments partition each dataset into training (80%) and test (20%) datasets.
  • Fig.17 illustrates a comparison in prediction accuracy over iterations between CCR and MSE in accordance to embodiments of the invention.
  • Lines 1701 and 1702 are for 2 clusters, and lines 1703 and 1704 are for a single cluster. As shown, the prediction performance is improved over iterations for the 2 clusters using embodiments of the invention.
  • Fig.18 illustrates how the cluster characteristics change over iterations given two clusters in accordance to embodiments. Specifically, Fig.18 illustrates how the centroid values move over iterations, where curves 1850 are for the CCR and curves 1860 are for the MSE. Seven attributes are used to cluster customers to address the heterogeneous customer population: global distribution system (1803), booking channel (1802), number of nights (1806), number of arriving customers (1807), booking advanced days (1801), whether customers arrived on a weekend (1805), and whether customers booked through a corporate code (1804).
  • Fig.18 shows how each attribute is moved for each cluster over iterations.
  • embodiments incorporate a novel approach to predicting the customer choice and estimating the relative values of the room categories and service type features in the hotel industry based on the booking customers’ attributes, orders of the room-service pairs in the offer, and offered price.
  • most of the demand- forecasting tools currently used by the hotel industry are aimed at providing the overall number of bookings based on a time series analysis assuming a single cluster (i.e., homogeneous customer population), thus ignoring heterogeneous customer populations.
  • These demand modeling tools are often ineffective in the presence of heterogeneous customers with significantly different willingness-to-pay and patterns of behavior.
  • Embodiments enable high-accuracy prediction of the room-service combination by a booking customer. Through the computational experiments, embodiments show that the prediction rate of the dynamic clustering approach is around 4% higher than that of the static clustering approach. Further, embodiments input the information on the order of the room category and rate code for the display optimization system, which can help hoteliers formulate more suitable marketing strategies and propose personalized recommendations that are more likely to be accepted.
  • embodiments can incorporate any unsupervised machine learning techniques for clustering, such as random forest, or soft clustering algorithms using Gaussian mixture models, into the first step of the algorithm.
  • the cluster membership is unobservable; thus, it is more challenging to assume a pre-specified parametric model about how clusters are formed based on the customer characteristics, and hard to test whether the pre-specified parametric model is correct. Failure to specify a correct model leads to biased parameter estimates or low goodness-of-fit measures, which affects prediction accuracy. Since embodiments do not require any pre-specified parametric model form for the clustering structure, possible biases from model misspecification can be avoided.
  • Embodiments implement dynamic clustering as a form of machine learning, particularly when it involves training as with embodiments of the invention.
  • Embodiments use unsupervised learning, which takes a set of data that contains only inputs and finds structure in the data, such as grouping or clustering of data points.
  • Cluster analysis is the assignment of a set of observations into subsets, referred to as clusters, so that observations within the same cluster are similar according to one or more predesignated criteria, while observations drawn from different clusters are dissimilar.
  • Different clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness, or the similarity between members of the same cluster, and separation, the difference between clusters.
  • Dynamic clustering as a form of unsupervised online/incremental machine learning considers two concepts: (1) incrementality of the learning methods to devise the clustering model and (2) self-adaptation of the learned model (parameters and structure).

Landscapes

  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments generate a demand model for a potential hotel customer of a hotel room. Embodiments, based on features of the potential hotel customer, form a plurality of clusters, each cluster including a corresponding weight and cluster probabilities. Embodiments generate an initial estimated mixture of multinomial logit ("MNL") models corresponding to each of the plurality of clusters, the mixture of MNL models including a weighted likelihood function based on the features and the weights. Embodiments determine revised cluster probabilities and update the weights. Embodiments estimate an updated estimated mixture of MNL models and maximize the weighted likelihood function based on the revised cluster probabilities and updated weights. Based on the updated weights and updated estimated mixture of MNL models, embodiments generate the demand model that is adapted to predict a choice probability of room categories and rate code combinations for the potential hotel customer.

Description

ARTIFICIAL INTELLIGENCE BASED HOTEL DEMAND MODEL CROSS REFERENCE TO RELATED APPLICATIONS [0001] This application claims priority of U.S. Provisional Patent Application Serial No.63/215,688, filed on June 28, 2021, the disclosure of which is hereby incorporated by reference. FIELD [0002] One embodiment is directed generally to a computer system, and in particular to a computer system that generates an artificial intelligence based hotel demand model. BACKGROUND INFORMATION [0003] Increased competition in the hotel industry has caused hoteliers to look for more innovative revenue management policies, such as personalized pricing and recommendations. Over the past few years, hoteliers have come to understand that not all guests are equal and a traditional one-size-fits-all policy might prove to be ineffective. Therefore, a need exists for hotels to profile their guests and offer them the right product/service at the right price with the goal of maximizing their profit. SUMMARY [0004] Embodiments generate a demand model for a potential hotel customer of a hotel room. Embodiments, based on features of the potential hotel customer, form a plurality of clusters, each cluster including a corresponding weight and cluster probabilities. Embodiments generate an initial estimated mixture of multinomial logit (“MNL”) models corresponding to each of the plurality of clusters, the mixture of MNL models including a weighted likelihood function based on the features and the weights. Embodiments estimate an updated estimated mixture of MNL models and maximize the weighted likelihood function based on the revised cluster probabilities and updated weights. Based on the update weights and updated estimated mixture of MNL models, embodiments generate the demand model that is adapted to predict a choice probability of room categories and rate code combinations for the potential hotel customer. 
BRIEF DESCRIPTION OF THE DRAWINGS [0005] Fig.1 is an overview block diagram of a hotel reservation system in accordance to embodiments of the invention. [0006] Fig.2 is a block diagram of a computer server/system in accordance with an embodiment of the present invention. [0007] Fig.3 is a flow diagram of the functionality of the room demand model module of Fig.2 for generating a room demand model in accordance with one embodiment. [0008] Fig.4 illustrates an example of the initial clustering in accordance with embodiments. [0009] Fig.5 is an example illustrating various offered prices, room categories and rate codes. [0010] Fig.6 illustrates the choice modeling for guest clusters in accordance to example embodiments. [0011] Fig.7 illustrates the initial assignment of an MNL model to each cluster in accordance to embodiments. [0012] Fig.8 illustrates the proposed likelihood function used with the EM functionality in accordance to embodiments. [0013] Fig.9 illustrates a portion of the EM functionality in accordance to embodiments. [0014] Fig.10 illustrates a portion of the EM functionality in accordance to embodiments. [0015] Figs.11-16 illustrate an example of embodiments of the invention for three clusters. [0016] Fig.17 illustrates a comparison in prediction accuracy over iterations between CCR and MSE in accordance to embodiments of the invention. [0017] Fig.18 illustrates how the cluster characteristics are changed over iterations given two clusters in accordance to embodiments. DETAILED DESCRIPTION [0018] Embodiments predict the choice of the hotel room category and associated service type by customers based on estimating parameters of discrete- choice models built on dynamically determined clusters of observations. Each observation corresponds to the choice made by a customer booking a room in the hotel and selecting an associated type of service from an ordered set of room categories and service type pairs offered at certain prices. 
Each room category and type of service is described by a set of features determining the customer’s value, or utility, of the choice. In addition, each customer is characterized by a set of their own attributes determining the cluster to which the customer belongs, also known as the “persona type.” It is assumed that each persona type may have its own utility of the booking choice. [0019] The choice probability is modeled as a multinomial logit function based on the room-service pair utility for each persona type. Embodiments increase the accuracy of the prediction and build a basis for the prescriptive analytics application to optimize the personalized offer by maximizing the expected revenue. Embodiments can be used as a standalone system or as the central part of the personalized price optimization system for the personalized hotel rooms and the display optimization system for the order of the room category and rate code. Embodiments utilize iteratively reconfigurable dynamic clustering based on a semi-parametric mixture of discrete choice models to fully reflect a customers' choice behavior instead of using a static clustering traditionally used for this purpose. [0020] In general, in the hotel industry, as well as other comparative industries, increased competition is driving more innovative revenue management practices such as personalized offers and pricing. Not all customers are the same, and a traditional one-size-fits-all policy might prove to be ineffective. Accurate estimation of demand as an input to a personalized recommendation system is crucial. 
[0021] Embodiments address the need for a more accurate estimation of demand for hotel rooms by modeling the demand to account for heterogeneous customers with different: (1) Willingness-to-pay (indicated by the selected price range); (2) Rate plan selections (corporate discount, breakfast included, etc.); (3) Travel attributes; (4) Booking channels; (5) Booking windows; (6) Length of stay; (7) Date of arrival; and/or (8) Size of the group/family, number of children, etc. The factors influencing the choice can include room features, rate plan features, price, and the order in which the offers are shown. [0022] Reference will now be made in detail to the embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. However, it will be apparent to one of ordinary skill in the art that the present disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments. Wherever possible, like reference numbers will be used for like elements. [0023] Fig.1 is an overview block diagram of a hotel reservation system 100 in accordance to embodiments of the invention. Fig.1 includes booking channels 102 that a potential hotel customer may interact with to reserve a hotel room. The channels include a Global Distribution System (“GDS”) 111, including “Amadeus”, “Sabre”, “Travel Port”, etc., Online Travel Agencies (“OTA”) 112, including “Booking.com”, “Expedia”, etc., Metasearch sites 113, and any other means for a customer to reserve a hotel room, including a website maintained by a hotel chain or individual hotel. 
[0024] Each hotel chain operations 104 is accessed by an Application Programming Interface (“API”) 140 as a Web Service such as a “WebLogic Server” from Oracle Corp. Hotel chain operations 104 includes a Hotel Property Management System (“PMS”) 121, such as “OPERA Cloud Property Management” from Oracle Corp., a Hotel Central Reservation System (“CRS”) 122, and a Demand Modeling module 150 that interfaces with systems 121 and 122 to provide optimized demand modeling as disclosed herein. [0025] A hotel customer or potential hotel customer that uses system 100 to obtain a hotel room typically engages in a three stage booking process. First an area availability search is conducted. Multiple hotel chains are shown and hotel CRS 122 provides static data. The static data can include the min/max rate, available dates, etc. [0026] If the booking customer selects a hotel, they go to the next step which is the property search, including a single hotel property, multiple rooms and rate plans. For the single hotel property, information may include room category description data, rate plan description and room price, each of which is shown in a specific order. The property search includes real-time availability data and results in the booking customer selecting a room. Once the room is selected, the final step is final booking and the reservation being guaranteed by a credit card or other form of payment. [0027] Fig.2 is a block diagram of a computer server/system 10 in accordance with an embodiment of the present invention. Although shown as a single system, the functionality of system 10 can be implemented as a distributed system. Further, the functionality disclosed herein can be implemented on separate servers or devices that may be coupled together over a network. Further, one or more components of system 10 may not be included. 
For example, when implemented as a web server or cloud based functionality, system 10 is implemented as one or more servers, and user interfaces such as displays, mouse, etc. are not needed. In embodiments, system 10 can be used to implement any of the elements shown in Fig.1. [0028] System 10 includes a bus 12 or other communication mechanism for communicating information, and a processor 22 coupled to bus 12 for processing information. Processor 22 may be any type of general or specific purpose processor. System 10 further includes a memory 14 for storing information and instructions to be executed by processor 22. Memory 14 can be comprised of any combination of random access memory (“RAM”), read only memory (“ROM”), static storage such as a magnetic or optical disk, or any other type of computer readable media. System 10 further includes a communication device 20, such as a network interface card, to provide access to a network. Therefore, a user may interface with system 10 directly, or remotely through a network, or any other method. [0029] Computer readable media may be any available media that can be accessed by processor 22 and includes both volatile and nonvolatile media, removable and non-removable media, and communication media. Communication media may include computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and includes any information delivery media. [0030] Processor 22 is further coupled via bus 12 to a display 24, such as a Liquid Crystal Display (“LCD”). A keyboard 26 and a cursor control device 28, such as a computer mouse, are further coupled to bus 12 to enable a user to interface with system 10. [0031] In one embodiment, memory 14 stores software modules that provide functionality when executed by processor 22. The modules include an operating system 15 that provides operating system functionality for system 10. 
The modules further include room demand model module 16 that generates a room demand model to maximize the expected hotel room revenue, and all other functionality disclosed herein. As a hotel variable operating cost is relatively small, the expected revenue (i.e., the product of the room booking probability and room price) is the main optimization objective in embodiments. System 10 can be part of a larger system. Therefore, system 10 can include one or more additional functional modules 18 to include the additional functionality, such as the functionality of a Property Management System (“PMS”) (e.g., the “Oracle Hospitality OPERA Property” or the “Oracle Hospitality OPERA Cloud Services”) or an enterprise resource planning (“ERP”) system. A database 17 is coupled to bus 12 to provide centralized storage for modules 16 and 18 and store guest data, hotel data, transactional data, etc. In one embodiment, database 17 is a relational database management system (“RDBMS”) that can use Structured Query Language (“SQL”) to manage the stored data. [0032] In one embodiment, particularly when there are a large number of hotel locations, a large number of guests, and a large amount of historical data, database 17 is implemented as an in-memory database (“IMDB”). An IMDB is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism. Main memory databases are faster than disk-optimized databases because disk access is slower than memory access, the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory eliminates seek time when querying the data, which provides faster and more predictable performance than disk. [0033] In one embodiment, database 17, when implemented as a IMDB, is implemented based on a distributed data grid. 
A distributed data grid is a system in which a collection of computer servers work together in one or more clusters to manage information and related operations, such as computations, within a distributed or clustered environment. A distributed data grid can be used to manage application objects and data that are shared across the servers. A distributed data grid provides low response time, high throughput, predictable scalability, continuous availability, and information reliability. In particular examples, distributed data grids, such as, e.g., the “Oracle Coherence” data grid from Oracle Corp., store information in-memory to achieve higher performance, and employ redundancy in keeping copies of that information synchronized across multiple servers, thus ensuring resiliency of the system and continued availability of the data in the event of failure of a server. [0034] In one embodiment, system 10 is a computing/data processing system including an application or collection of distributed applications for enterprise organizations, and may also implement logistics, manufacturing, and inventory management functionality. The applications and computing system 10 may be configured to operate with or be implemented as a cloud-based networking system, a software-as-a-service (“SaaS”) architecture, or other type of computing solution. [0035] Embodiments solve the problem of predicting demand for multiple hotel room categories and service type combinations based on the hotel customer attributes, room category and service type features, offered price, and the order in which the room- rate pairs are presented to the customer. Rather than assuming homogeneous characteristics of the customers (i.e., where the expected demand should be the same when the same prices are offered), embodiments assume that the customer population includes several clusters to allow for customer characteristics and choice patterns to be heterogeneous across the clusters. 
In addition to predicting the demand of these heterogeneous customers (i.e., where the expected demand could be different, even when the same prices are offered), embodiments estimate the dynamic size of each cluster and centroid of each cluster recomputed over iterations to reflect the new assignments. The principal output of the problem is the probability of each individual customer booking a room in a specific room categories-service type combination. [0036] Embodiments utilize a dynamic clustering approach to enable high- accuracy prediction of the room-service combination by a booking customer. Embodiments start with an initial clustering to divide customers into several clusters so that the characteristics of customers within each cluster can be more homogeneous than those from other clusters, and assume a personalized choice model within each cluster. Since cluster membership of customers (i.e., which cluster each customer belongs to) is unobservable, embodiments employ a soft clustering approach, in which the "mix" is captured through a customer's probability of belonging to each cluster. [0037] To do so, embodiments implement unsupervised clustering using a random forest clustering algorithm with a certain number of clusters based on the characteristics of the potential hotel customers, orders of the room-service pairs, and their features, including the price offered. Next, embodiments derive a weighted likelihood function from the observed customers based on the discrete choice multinomial logit (“MNL”) models corresponding to the clusters with the weights set to the cluster probabilities obtained from the initial clustering. Then, embodiments maximize the weighted likelihood function to obtain the values of coefficients of each covariate and the intercept in the MNL models. Choice probabilities for multiple hotel room categories-service type combinations for each customer are calculated from those values. 
The number of clusters is set to the value that delivers the best accuracy of the prediction. [0038] In embodiments, the initial clustering is based on the customer features, not their choices. To incorporate the choice behavior of customers into clustering, embodiments update the weights as the initial clustering probabilities multiplied by choice probabilities calculated at the previous step, which can be viewed as the E-step of the Expectation-Maximization (“EM”) algorithm. Then, embodiments re-fit the models with the newly formed clusters performing the dynamic clustering step by maximizing the updated weighted likelihood function, which constitutes the M-step of the EM algorithm. Finally, embodiments reiterate this E-step and M-step until the convergence criterion is met. [0039] After the convergence, embodiments obtain the final estimates of the model parameters. For a new customer with their own characteristics, orders of the room-service pairs, and room category features, including the price offered, embodiments can predict the choice probabilities for the new customer after estimating their association with each cluster by solving a classification problem employing the supervised Random Forest classifier. Dynamic Iteratively Reconfigurable Clustering Algorithm/Functionality [0040] In general, embodiments implement a dynamic iteratively reconfigurable clustering algorithm/functionality for predicting demand in order to generate a hotel room demand model. Assume that a customer population of interest consists of G clusters (G > 1), where the pattern of booking a room is relatively homogeneous across customers within each cluster, while there is heterogeneity in booking patterns across clusters. Under this assumption, it is intuitive to consider G different choice models across clusters, i.e., a choice model fitted to each cluster separately.
In practice, however, cluster membership indicating which cluster each customer belongs to is unobservable. In contrast, embodiments implement a novel algorithm/functionality to deal with the issue of estimating heterogeneous booking patterns of customers across clusters when cluster membership is unknown. [0041] Specifically, assume that customer i is characterized by a set of observable covariates $x_i$, for i = 1, … , n, where n is the number of customers in a data set. Let J be the number of products considered in a market, and $S_i$ be the set of available products to customer i, i.e., $S_i \subset \{1, ..., J\}$. Let $y_i$ denote the product choice made by customer i, where $y_i \in S_i$. Product j = 1, ..., J is characterized by a set of observable variables $z_{ij}$. Then, letting $l_i$ be a cluster membership indicator for customer i, the MNL room selection probability within cluster g can be expressed as follows:

$$P(y_i = j \mid l_i = g, z_i; \theta_g) = \frac{\exp(\alpha_{gj} + z_{ij}^{\top}\beta_g)}{\sum_{k \in S_i} \exp(\alpha_{gk} + z_{ik}^{\top}\beta_g)}, \quad j \in S_i, \qquad (1)$$

where $z_i = \{z_{ij} : j \in S_i\}$, $\theta_g = (\alpha_{g1}, \dots, \alpha_{gJ}, \beta_g)$, $\alpha_{gB} = 0$ for identifiability, and B ∈ {1,...,J} is the baseline product. In embodiments, B = J is set to demonstrate embodiments of the invention. Since the cluster membership indicator is unobservable, this is regarded as a latent variable and a model is needed to explain different probabilities of belonging to a cluster across different customer features. Specifically, assume a model for $l_i$, called mixing distribution, as follows:

$$P(l_i = g \mid x_i) = \pi_g(x_i), \quad g = 1, \dots, G, \qquad (2)$$

where $\pi_g(\cdot)$ is a generic notation for a probability mass function which depends on $x_i$ and which is unknown.
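As a minimal sketch of the within-cluster MNL probability of equation (1), the function below assumes a linear utility made of a product-specific intercept plus a coefficient vector applied to the product features. That utility form, the function name, the argument layout, and all numeric values are illustrative assumptions rather than details taken from the disclosure.

```python
import math


def mnl_choice_probs(available, intercepts, features, coefs):
    """Return {product: probability} over the available products for one cluster.

    Assumed utility (illustrative): intercepts[j] + dot(features[j], coefs).
    A product without an entry in `intercepts` gets intercept 0 (the baseline).
    """
    utilities = {
        j: intercepts.get(j, 0.0) + sum(f * c for f, c in zip(features[j], coefs))
        for j in available
    }
    # Shift by the largest utility so exp() cannot overflow.
    u_max = max(utilities.values())
    exp_u = {j: math.exp(u - u_max) for j, u in utilities.items()}
    total = sum(exp_u.values())
    return {j: e / total for j, e in exp_u.items()}


# Hypothetical offer: three room-rate pairs; product 3 is the baseline.
features = {1: [3.35, 1.0], 2: [2.80, 0.0], 3: [2.10, 0.0]}  # [price in $100s, breakfast flag]
probs = mnl_choice_probs({1, 2, 3}, {1: 0.4, 2: 0.1}, features, [-0.8, 0.5])
# Same cluster, but only products 2 and 3 are available to this customer.
probs_restricted = mnl_choice_probs({2, 3}, {1: 0.4, 2: 0.1}, features, [-0.8, 0.5])
```

Restricting the availability set only changes the denominator, so the ratio between the probabilities of any two products that remain available is unchanged.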
[0042] One general approach to model $\pi_g(\cdot)$ is to assume an MNL (also known as logit) model, which specifies that a customer belongs to cluster g with probability

$$P(l_i = g \mid x_i; \gamma) = \frac{\exp(x_i^{\top}\gamma_g)}{\sum_{g'=1}^{G} \exp(x_i^{\top}\gamma_{g'})}, \qquad (3)$$

where embodiments set $\gamma_1 = 0$ for identifiability. The vector $\gamma_g$ specifies how customer features affect clustering, i.e., which cluster the customer belongs to. However, unlike the product choice $y_i$, the cluster membership indicator $l_i$ is unobservable; thus, the true structure of the mixing distribution is not known in practice, and it is hard to test whether the specified model is correct. A pre-specified parametric family for the mixing distribution as in equation (3) may not be consistent with the true mixing distribution, referred to as the model misspecification problem, which leads to biased parameter estimates or low goodness-of-fit measures, which affects prediction accuracy. [0043] To avoid such a model misspecification and improve prediction performance, embodiments implement a semiparametric mixture of discrete choice models by assuming equations (1) and (2) rather than equations (1) and the MNL model (3). Letting the model parameters be denoted $\theta = \{\theta_1, \dots, \theta_G\}$, where $\theta_g$ collects the choice model parameters of cluster g, the likelihood function of $\theta$ is written as

$$L(\theta) = \prod_{i=1}^{n} \sum_{g=1}^{G} P(l_i = g \mid x_i)\, P(y_i \mid l_i = g, z_i; \theta_g), \qquad (4)$$

where no pre-specified parametric model form is imposed on $P(l_i = g \mid x_i)$, which can be estimated by using any nonparametric clustering method such as random forest. Other unsupervised machine learning techniques for clustering can also be used in other embodiments. Then, embodiments use the same idea as the EM algorithm, as follows. Suppose that the latent cluster membership indicator $l_i$ is known. Then, the complete likelihood function is

$$L_c(\theta) = \prod_{i=1}^{n} \prod_{g=1}^{G} \left[ P(l_i = g \mid x_i)\, P(y_i \mid l_i = g, z_i; \theta_g) \right]^{1(l_i = g)}$$

and the complete log-likelihood function is

$$\log L_c(\theta) = \sum_{i=1}^{n} \sum_{g=1}^{G} 1(l_i = g) \left[ \log P(l_i = g \mid x_i) + \log P(y_i \mid l_i = g, z_i; \theta_g) \right].$$
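The mixture likelihood that serves as the objective function can be evaluated on the log scale once the cluster probabilities and the per-cluster choice probabilities of the observed choices are available. The sketch below is illustrative only: the function name, the nested-list layout, and the numeric values are assumptions, not part of the disclosure.

```python
import math


def mixture_log_likelihood(cluster_probs, chosen_probs):
    """Sum over customers of log( sum_g pi_ig * p_ig ).

    cluster_probs[i][g]: probability that customer i belongs to cluster g.
    chosen_probs[i][g]:  probability, under cluster g's choice model, of the
                         product that customer i actually chose.
    """
    total = 0.0
    for pi_i, p_i in zip(cluster_probs, chosen_probs):
        total += math.log(sum(w * p for w, p in zip(pi_i, p_i)))
    return total


# Hypothetical values for two customers and two clusters.
ll = mixture_log_likelihood([[0.5, 0.5], [0.9, 0.1]], [[0.2, 0.6], [0.7, 0.3]])
```

Working with the logarithm turns the product over customers into a sum, which avoids numerical underflow when the number of customers is large.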
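[0044] In the EM algorithm, the maximizer of the objective function of equation (4) can be found by using the following iterative method:

$$\hat{\theta}^{(t+1)} = \arg\max_{\theta} \; E\left[ \log L_c(\theta) \mid \{y_i, x_i, z_i\}_{i=1}^{n}; \hat{\theta}^{(t)} \right],$$

where $L_c(\theta)$ is the complete likelihood function. [0045] Specifically, embodiments repeat the following E-step and M-step. E-step [0046] Compute the conditional expectation of the indicator $1(l_i = g)$ given the observed data $\{y_i, x_i, z_i\}$:

$$\hat{\pi}_{ig}^{(t)} = P(l_i = g \mid y_i, x_i, z_i; \hat{\theta}^{(t)}) = \frac{P(l_i = g \mid x_i)\, P(y_i \mid l_i = g, z_i; \hat{\theta}_g^{(t)})}{\sum_{g'=1}^{G} P(l_i = g' \mid x_i)\, P(y_i \mid l_i = g', z_i; \hat{\theta}_{g'}^{(t)})}.$$

M-step [0047] Update the parameter $\theta$ by solving the equation:

$$\sum_{i=1}^{n} \sum_{g=1}^{G} \hat{\pi}_{ig}^{(t)} \frac{\partial}{\partial \theta} \log P(y_i \mid l_i = g, z_i; \theta_g) = 0.$$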
[0048] As disclosed, embodiments employ an unsupervised clustering technique, such as random forest, based on customer features. Thus, the EM algorithm can be adjusted to this context, referred to as iterative reconfigurable clustering, as follows. Initial clustering: [0049] Perform an unsupervised soft clustering (e.g., random forest, k-means) with G clusters based on the customer-level covariates $\{x_i : i = 1, \dots, n\}$, and obtain the clustering probabilities for each cluster, $\hat{\pi}_{i1}^{(0)}, \dots, \hat{\pi}_{iG}^{(0)}$, where $\hat{\pi}_{ig}^{(0)}$ is the initial estimate of $P(l_i = g \mid x_i)$. Then obtain $\hat{\theta}_1^{(1)}, \dots, \hat{\theta}_G^{(1)}$, the initial parameter values, by solving the equations:

$$\sum_{i=1}^{n} \hat{\pi}_{ig}^{(0)} \frac{\partial}{\partial \theta_g} \log P(y_i \mid l_i = g, z_i; \theta_g) = 0, \quad g = 1, \dots, G. \qquad (5)$$
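E-step [0050] Embodiments determine the conditional cluster probabilities by using the observed choice and the fitted discrete choice model as follows: at iteration t = 1, 2, …, if $y_i = j$,

$$\hat{\pi}_{ig}^{(t)} = \frac{\hat{\pi}_{ig}^{(0)}\, P(y_i = j \mid l_i = g, z_i; \hat{\theta}_g^{(t)})}{\sum_{g'=1}^{G} \hat{\pi}_{ig'}^{(0)}\, P(y_i = j \mid l_i = g', z_i; \hat{\theta}_{g'}^{(t)})},$$

such that $\sum_{g=1}^{G} \hat{\pi}_{ig}^{(t)} = 1$.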
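The E-step described here, in which the initial clustering probabilities are multiplied by the choice probabilities of the observed choices and then normalized so that they sum to one for each customer, can be sketched as follows. The function name, data layout, and example numbers are hypothetical.

```python
def e_step(initial_probs, chosen_probs):
    """Return updated cluster weights, one list per customer.

    initial_probs[i][g]: clustering probability of customer i for cluster g
                         from the initial soft clustering.
    chosen_probs[i][g]:  probability, under cluster g's fitted choice model,
                         of the product that customer i actually chose.
    """
    weights = []
    for pi_i, p_i in zip(initial_probs, chosen_probs):
        unnormalized = [w * p for w, p in zip(pi_i, p_i)]
        norm = sum(unnormalized)
        # Normalize so the weights of each customer sum to one.
        weights.append([u / norm for u in unnormalized])
    return weights


# One customer, two clusters: equal initial probabilities, but the observed
# choice is three times as likely under the second cluster's model.
weights = e_step([[0.5, 0.5]], [[0.2, 0.6]])
```

In this example the customer's weight shifts toward the second cluster, which illustrates how the observed choice behavior reshapes cluster membership over iterations.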
M-step [0051] Update the choice model parameters to $\hat{\alpha}_g^{(t+1)}$ and $\hat{\beta}_g^{(t+1)}$ by solving the following equations. For each g, first find the solutions for $\alpha_{gj}$, j = 1, ..., J-1, from the equation as follows:

$$\sum_{i=1}^{n} \hat{\pi}_{ig}^{(t)} \frac{\partial}{\partial \alpha_{gj}} \log P(y_i \mid l_i = g, z_i; \theta_g) = 0.$$

Then obtain $\hat{\beta}_g^{(t+1)}$ by solving the equation with respect to $\beta_g$:

$$\sum_{i=1}^{n} \hat{\pi}_{ig}^{(t)} \frac{\partial}{\partial \beta_g} \log P(y_i \mid l_i = g, z_i; \theta_g) = 0,$$

where g = 2, … , G. [0052] Repeat the E-step and the M-step until the convergence criterion $\|\hat{\theta}^{(t+1)} - \hat{\theta}^{(t)}\| < \epsilon$ is met for a given $\epsilon > 0$. [0053] The above can be viewed as a variant of the EM algorithm. In embodiments, the “Theorem of Dempster et al. (1977)” can be applied to the proposed iterative algorithm, which says that the solution $\{\hat{\theta}^{(t)}\}$ converges to $\hat{\theta}$, where $\hat{\theta}$ is the maximizer of our objective function, $L(\theta)$.
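The alternation of E-step and M-step until convergence can be illustrated with a deliberately simplified toy model. The sketch assumes an intercept-only MNL per cluster with every product available to every customer, in which case the weighted maximum likelihood estimate of each cluster's choice probabilities is simply the weighted choice share, so no numerical solver is needed. It also assumes strictly positive initial clustering probabilities. These simplifications, the function names, and the data are assumptions for illustration and are not the disclosed estimation procedure.

```python
def m_step(choices, weights, n_products, n_clusters):
    """Weighted choice shares per cluster (closed form for an intercept-only MNL)."""
    shares = []
    for g in range(n_clusters):
        totals = [0.0] * n_products
        for y, w in zip(choices, weights):
            totals[y] += w[g]
        norm = sum(totals)
        shares.append([t / norm for t in totals])
    return shares


def e_step(choices, initial_probs, shares):
    """Initial cluster probability times choice probability, normalized per customer."""
    weights = []
    for y, pi_i in zip(choices, initial_probs):
        unnormalized = [pi * shares[g][y] for g, pi in enumerate(pi_i)]
        norm = sum(unnormalized)
        weights.append([u / norm for u in unnormalized])
    return weights


def fit_mixture(choices, initial_probs, n_products, tol=1e-8, max_iter=500):
    n_clusters = len(initial_probs[0])
    # Initial fit uses the initial clustering probabilities as weights.
    shares = m_step(choices, initial_probs, n_products, n_clusters)
    weights = initial_probs
    for _ in range(max_iter):
        weights = e_step(choices, initial_probs, shares)
        new_shares = m_step(choices, weights, n_products, n_clusters)
        # Largest absolute change in any estimated choice probability.
        change = max(abs(a - b)
                     for row_a, row_b in zip(new_shares, shares)
                     for a, b in zip(row_a, row_b))
        shares = new_shares
        if change < tol:
            break
    return shares, weights


# Hypothetical data: six customers, three products (0, 1, 2), two clusters.
choices = [0, 0, 1, 2, 2, 2]
initial_probs = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
shares, weights = fit_mixture(choices, initial_probs, n_products=3)
```

With covariates in the utility, the M-step no longer has a closed form and would instead maximize the weighted log-likelihood numerically for each cluster.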
Prediction Of Room Categories And Rate Codes Combination [0054] After convergence, embodiments obtain the final estimate of the model parameters $\hat{\theta} = \{\hat{\theta}_1, \dots, \hat{\theta}_G\}$. For a new customer characterized by $x^*$, embodiments predict the choice probabilities as follows: letting $S^*$ be the set of available products, for $j \in S^*$,

$$P(y^* = j \mid x^*, z^*) = \sum_{g=1}^{G} \hat{\pi}_g^{*}\, P(y^* = j \mid l^* = g, z^*; \hat{\theta}_g),$$

where $\hat{\pi}_g^{*}$ is the predicted probability of belonging to cluster g by the soft clustering, $z^* = \{z_j^* : j \in S^*\}$, and $z_j^*$ is a feature vector of room j available to the new customer. [0055] Fig.3 is a flow diagram of the functionality of room demand model module 16 of Fig.2 for generating a room demand model in accordance with one embodiment. In one embodiment, the functionality of the flow diagram of Fig.3 is implemented by software stored in memory or other computer readable or tangible medium, and executed by a processor. In other embodiments, the functionality may be performed by hardware (e.g., through the use of an application specific integrated circuit (“ASIC”), a programmable gate array (“PGA”), a field programmable gate array (“FPGA”), etc.), or any combination of hardware and software. [0056] At 302, an initial unsupervised soft clustering is developed to cluster customers based on a plurality of attributes/characteristics assigned to each customer. In embodiments, the attributes can include one or more of: (1) the global distribution system being used (e.g., Amadeus, SABRE, etc.); (2) the booking channel; (3) the number of nights; (4) the number of arriving customers; (5) booking advanced days; (6) weekend vs. weekday; (7) corporate booking. [0057] The initial clustering at 302 is based on the customer features, not the customer choices. Customer features are those features that are known at the time of the request for the room, and include such data as the arrival date and time, the number in the party, and the booking channel. In addition, the customer feature data includes other inferred features such as the booking window (i.e., the time between the booking and arriving date). [0058] The initial clustering, as well as the dynamic clustering which is described below where the initial clustering and subsequent clustering is dynamically updated, both incorporate machine learning.
Specifically, the initial clustering at 302 can incorporate any unsupervised machine learning techniques for clustering, such as random forest, or soft clustering algorithms using Gaussian mixture models. Unlike with the choice of customers, the cluster membership is unobservable and therefore it is more challenging to assume a pre-specified parametric model about how clusters are formed based on the customer characteristics, and hard to test whether the pre-specified parametric model is correct. Failure to specify a correct model leads to biased parameter estimates or low goodness-of-fit measures, which affects prediction accuracy. Since embodiments do not require any pre-specified parametric model form for the clustering structure, possible biases from model misspecification can be avoided. [0059] Fig.4 illustrates an example of the initial clustering in accordance with embodiments. As shown in Fig.4, based on the guest characteristics, external factors and travel attributes, three clusters are formed. In embodiments, the number of clusters is a predefined parameter based on the interpretability of the clustering, which typically limits the number of clusters to single digits. In various embodiments, two to four clusters are used. [0060] At 304, embodiments estimate an initial mixture of Multinomial Logit (“MNL”) models for the demand for the hotel room categories and rate codes combinations based on parameters related to the hotel room offerings, including: (1) Offered prices; (2) Room category and rate plan position in the offer; and (3) Room and rate features such as view, room size, whether breakfast is included, free cancellation, etc. For each cluster formed at 302, a separate MNL model is built at 304. Fig.5 is an example illustrating various offered prices (e.g., $335), room categories (e.g., deluxe or superior, king or queen bed) and rate codes (e.g., “Breakfast Included Rate”).
At 304, the historical booking data stored in a database (e.g., database 17 of Fig.2) is used to estimate the parameter defined in conjunction with equation (4) above by solving equation (5) above. [0061] Fig.6 illustrates the choice modeling for guest clusters in accordance with example embodiments. As shown in Fig.6, each cluster uses a unique discrete choice model to predict the choice of the hotel room and rate code combination for each customer. Fig.7 illustrates the initial assignment of an MNL model to each cluster in accordance with embodiments. [0062] 306, 308 and 310, collectively and on an iterative basis, form an Expectation Maximization (“EM”) functionality. The EM functionality also contains the soft clustering, which is updated in the E-step at 306. The soft clustering at 302 is an initial clustering which is not repeated. At 306, for the Expectation “E-step”, the cluster probabilities are updated by incorporating the choice probabilities of customers evaluated at the parameter values of the current iteration. [0063] Fig.8 illustrates the proposed likelihood function used with the EM functionality in accordance with embodiments. As shown, the proposed likelihood function includes both the cluster model generated at 302 and shown in Fig.4, and the choice model generated at 304 and shown in Fig.6. The proposed likelihood function is the objective function of the EM functionality. Embodiments find a maximizer of this objective function to estimate the model parameters by using the EM functionality. [0064] Fig.9 illustrates a portion of the EM functionality in accordance with embodiments. After the “Initial Step”, which is the soft clustering performed at 302, the Expectation E-step is performed at 306. [0065] At 308, for the Maximization “M-step”, embodiments estimate an updated mixture of MNL models where the mixture probabilities are the cluster probabilities updated in the E-step. 
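The E-step update at 306 can be sketched as follows. The sketch assumes the standard mixture-model form, in which each cluster probability is reweighted by how likely the customer's observed choice is under that cluster's MNL model and the result is renormalized; the function name `e_step` and the numbers are hypothetical.

```python
def e_step(prior_probs, choice_likelihoods):
    """Update one customer's cluster probabilities (assumed Bayes-rule form).

    prior_probs:        current cluster probabilities from the clustering model.
    choice_likelihoods: probability, under each cluster's MNL model at the
                        current parameter values, of the choice actually made.
    """
    joint = [p * lik for p, lik in zip(prior_probs, choice_likelihoods)]
    total = sum(joint)
    if total == 0.0:
        # No cluster explains the choice; keep the previous probabilities.
        return list(prior_probs)
    return [j / total for j in joint]


# Hypothetical customer with three clusters.
posterior = e_step([0.5, 0.3, 0.2], [0.10, 0.40, 0.25])
```

In this example the second cluster gains probability because its choice model assigns the highest likelihood to the observed booking.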
At 310, 306 and 308 are repeated until the convergence criterion, |New Prediction Error - Old Prediction Error| < 0.0001, is met. [0066] At 312, using the estimated parameters from 306 and 308, a demand model is generated that predicts the choice probabilities of room category and rate code combinations for a new customer. At 314, the functionality ends. [0067] The functionality of Fig.3 combines the estimation of discrete choice modeling with a data-driven identification of customer segments, captures varying preferences of a heterogeneous customer population, and provides interpretable model outputs. The demand model generated at 312 provides a practical approach that can help hoteliers profile their customers/guests based on their preferences, which can serve as a valuable input to: (1) formulate more efficient marketing policies and offer personalized recommendations that are more likely to be accepted; and (2) generate optimal personalized prices and display positions for each room type (e.g., suite with water view and queen bed). [0068] Fig.10 illustrates a portion of the EM functionality in accordance with embodiments. Fig.10 illustrates the M-step at 308 and the repeat until convergence at 310. [0069] Figs.11-16 illustrate an example of embodiments of the invention for three clusters. Fig.11 illustrates the soft clustering (302 of Fig.3) at 1101 and the choice modeling (304 of Fig.3) at 1102, where a different MNL model is generated for each of the clusters from the soft clustering. The number of clusters is determined before the EM functionality is used. To choose the best number of clusters, prediction accuracy measures are compared across several different numbers of clusters, and the number achieving the most accurate prediction is selected. There is a different MNL model for each cluster, but all of the model parameters are jointly estimated. 
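The stopping rule at 310 and the prediction at 312 can be sketched together. The sketch assumes that one pass of 306 and 308 is available as a callable returning the current prediction error, and that the new-customer prediction is the cluster-probability-weighted average of the per-cluster MNL probabilities; the names `run_until_converged` and `mixture_choice_probabilities`, the iteration cap, and all numbers are hypothetical.

```python
def mixture_choice_probabilities(cluster_probs, per_cluster_choice_probs):
    """Weight each cluster's choice probabilities by the cluster probability."""
    n_options = len(per_cluster_choice_probs[0])
    return [
        sum(w * probs[j] for w, probs in zip(cluster_probs, per_cluster_choice_probs))
        for j in range(n_options)
    ]


def run_until_converged(em_iteration, tol=1e-4, max_iter=100):
    """Call em_iteration() until successive prediction errors differ by < tol.

    em_iteration: callable performing one E-step and M-step and returning
                  the prediction error (a stand-in is used below).
    Returns (number of iterations run, last prediction error).
    """
    old_error = em_iteration()
    n_done = 1
    while n_done < max_iter:
        new_error = em_iteration()
        n_done += 1
        if abs(new_error - old_error) < tol:
            return n_done, new_error
        old_error = new_error
    return n_done, old_error


# Stand-in for real E-step + M-step passes: replays made-up error values.
_fake_errors = iter([0.30, 0.25, 0.24, 0.23995, 0.20])
n_iterations, final_error = run_until_converged(lambda: next(_fake_errors))

# Hypothetical new customer: two clusters, three room/rate options.
new_customer = mixture_choice_probabilities(
    [0.7, 0.3],
    [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]],
)
```

The loop stops on the fourth made-up error value, the first one that differs from its predecessor by less than 0.0001.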
The initial data 1103 for each guest is used as input and an initial cluster probability 1104 is generated for each guest. [0070] Fig.12 illustrates the first iteration (306 and 308 of Fig.3), which uses the E-step to reassign the conditional cluster probability at 1201. [0071] Fig.13 illustrates the first iteration (306 and 308 of Fig.3), which uses the M-step to update the choice model at 1301, which revises the conditional cluster probability. [0072] Fig.14 illustrates the second iteration using the E-step with the updated conditional cluster probability at 1301, and Fig.15 illustrates the second iteration using the M-step. For purposes of illustration, assume only two iterations. [0073] Fig.16 then illustrates the generation of the demand model using the estimated model parameters, which forms a prediction of a choice probability of a new customer.
Metrics for Assessment
[0074] To investigate the performance of the iterative reconfigurable clustering in accordance with embodiments, embodiments divide the dataset into a training dataset and a test dataset. After estimating the model parameters as well as the initial clustering from the training data, embodiments obtain the predicted values of the product choices among the customers in the test data. For prediction accuracy measurements, embodiments use the correct classification ratio (“CCR”) and mean squared error (“MSE”). [0075] The CCR is calculated as the percentage of the observations where the option with the highest predicted probability coincides with the observed choice. [0076] The MSE is calculated as:
$$\mathrm{MSE} = \frac{1}{n_{te}} \sum_{i \in \Phi_{te}} \sum_{j} \left( y_{ij} - \hat{y}_{ij} \right)^{2}$$
where $y_{ij}$ and $\hat{y}_{ij}$ are the true and predicted choices of customer $i$ for room type $j$, $\Phi_{te}$ is the index set of customers in the test data, and $n_{te}$ is the number of customers in the test data. This metric is also referred to as the Brier score, which is commonly used in evaluating probabilistic predictions. [0077] In experiments, embodiments were applied to a real, proprietary dataset of multiple hotels from multiple cities and countries. The data includes reservation information and corresponding customer characteristics. Further, the log data (i.e., real-time customer requests for a reservation and the corresponding responses by the reservation server system) is included. From this log, information was extracted on the display order of rooms and rate codes. This order is strategically set by each hotelier, so that display orders vary across customers. The final dataset included 9,173 reservations from July 2, 2019 to July 19, 2019 with 18 different rooms and 15 different rate codes.
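The CCR and MSE defined above can be computed with a short sketch. It assumes each customer's observed choice is encoded as a one-hot list over the offered options and each prediction as a list of probabilities; the function names and the three-customer example are hypothetical, not values from the experiments.

```python
def ccr(y_true, y_pred):
    """Share of customers whose highest-probability option was actually chosen."""
    hits = 0
    for truth, pred in zip(y_true, y_pred):
        predicted = max(range(len(pred)), key=pred.__getitem__)
        if truth[predicted] == 1:
            hits += 1
    return hits / len(y_true)


def mse(y_true, y_pred):
    """Squared error summed over options, averaged over customers (Brier score)."""
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        total += sum((t - p) ** 2 for t, p in zip(truth, pred))
    return total / len(y_true)


# Hypothetical test set: three customers, three room/rate options each.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]
ccr_value = ccr(y_true, y_pred)
mse_value = mse(y_true, y_pred)
```

Here two of the three customers chose the option with the highest predicted probability, so a higher CCR and a lower MSE both indicate better predictions.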
[0078] Embodiments first find the optimal number of clusters in the customer population, which is usually unknown. Embodiments employ the prediction criteria approach to select the optimal number of clusters. In particular, embodiments employ the MSE and choose the number of clusters which has the highest prediction accuracy among 2, 3, 4, and 5 clusters. Experiments determine that 2 clusters have the best performance among the 4 options. [0079] Given 2 clusters, embodiments of the invention are implemented using the real hotel reservation dataset. Specifically, the prediction accuracy of embodiments is compared against a single cluster benchmark. Embodiments partition each dataset into training (80%) and test (20%) datasets. The results are presented below in Table 1, where the iterations are stopped at 17 since the criterion based on MSE is met (i.e., close to 0.0001):
Table 1. Comparison of prediction accuracy over iterations using MSE and CCR
[Table 1 appears as images in the original publication (imgf000021_0001 and imgf000022_0001). Its columns are indexed by iteration, starting at 0; the values are not recoverable from the extracted text.]
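The selection procedure in paragraphs [0078] and [0079] can be sketched as an 80/20 split followed by picking the candidate cluster count with the lowest test MSE. The helper names are hypothetical, and the MSE values in the example are made-up placeholders rather than results from the experiments; in practice `test_mse_for` would fit the model on the training data for the given number of clusters and score it on the test data.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle records reproducibly and split them into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def select_num_clusters(candidates, test_mse_for):
    """Return the candidate cluster count with the lowest test MSE."""
    scores = {k: test_mse_for(k) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores


train, test = train_test_split(range(100))

# Placeholder scores standing in for a real fit-and-evaluate routine.
placeholder_mse = {2: 0.31, 3: 0.33, 4: 0.34, 5: 0.36}
best_k, scores = select_num_clusters([2, 3, 4, 5], placeholder_mse.get)
```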
[0080] Fig.17 illustrates a comparison in prediction accuracy over iterations between CCR and MSE in accordance with embodiments of the invention. In Fig.17, lines 1701 and 1702 are for 2 clusters, and lines 1703 and 1704 are for a single cluster. As shown, the prediction performance improves over iterations for the 2 clusters using embodiments of the invention. Specifically, the value of CCR increases while the value of MSE decreases over iterations. It is also observed that the result for a single cluster (i.e., known solutions) is worse than the result with the 2 clusters. [0081] Fig.18 illustrates how the cluster characteristics change over iterations given two clusters in accordance with embodiments. Specifically, Fig.18 illustrates how the centroid values move over iterations, where curves 1850 are for the CCR and curves 1860 are for the MSE. Seven attributes are used to cluster customers to address the heterogeneous customer population: global distribution system (1803), booking channel (1802), number of nights (1806), number of arriving customers (1807), booking advanced days (1801), whether customers arrived on a weekend (1805), and whether customers booked through a corporate code (1804). Fig.18 shows how each attribute moves for each cluster over iterations. [0082] As disclosed, embodiments incorporate a novel approach to predicting the customer choice and estimating the relative values of the room categories and service type features in the hotel industry based on the booking customers’ attributes, the order of the room-service pairs in the offer, and the offered price. Specifically, most of the demand-forecasting tools currently used by the hotel industry are aimed at providing the overall number of bookings based on a time series analysis assuming a single cluster (i.e., a homogeneous customer population), thus ignoring heterogeneous customer populations. 
These demand modeling tools are often ineffective in the presence of heterogeneous customers with significantly different willingness-to-pay and patterns of behavior. Even the few tools that consider a heterogeneous customer population employ a standard clustering algorithm, which may not reflect the customer choice behaviors during the clustering process. Moreover, in general, no demand forecasting tools address the order of the room category-rate code pairs. The order of display on the website affects the customer's choice behavior in addition to the offered price. [0083] Embodiments enable high-accuracy prediction of the room-service combination chosen by a booking customer. Through the computational experiments, embodiments show that the prediction rate using a dynamic clustering approach is around 4% higher than that of the static clustering approach. Further, embodiments provide the information on the order of the room category and rate code as input to the display optimization system, which can help hoteliers formulate more suitable marketing strategies and propose personalized recommendations that are more likely to be accepted. [0084] In addition, embodiments can incorporate any unsupervised machine learning techniques for clustering, such as random forest, or soft clustering algorithms using Gaussian mixture models, into the first step of the algorithm. Unlike the choices of customers, the cluster membership is unobservable; thus, it is more challenging to assume a pre-specified parametric model of how clusters are formed based on the customer characteristics, and hard to test whether the pre-specified parametric model is correct. Failure to specify a correct model leads to biased parameter estimates or low goodness-of-fit measures, which affects prediction accuracy. Since embodiments do not require any pre-specified parametric model form for the clustering structure, possible biases from model misspecification can be avoided. 
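Soft clustering with a Gaussian mixture model, one of the options named above for the first step, can be illustrated with a minimal sketch. It assumes a single customer feature (booking advance days) and an already fitted two-component mixture; the weights, means, and standard deviations are hypothetical, and fitting the mixture itself is outside the sketch.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def soft_membership(x, weights, means, stds):
    """Probability of belonging to each mixture component, given feature x."""
    dens = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(dens)
    return [d / total for d in dens]


# Hypothetical fitted mixture over "booking advance days".
weights = [0.6, 0.4]
means = [3.0, 30.0]
stds = [2.0, 10.0]

short_lead = soft_membership(2.0, weights, means, stds)   # booked 2 days ahead
long_lead = soft_membership(28.0, weights, means, stds)   # booked 28 days ahead
```

Unlike a hard assignment, each customer receives a probability for every cluster, which is the form the later E-step updates operate on.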
[0085] Embodiments implement dynamic clustering as a form of machine learning, particularly when it involves training as with embodiments of the invention. Embodiments use unsupervised learning, which takes a set of data that contains only inputs, and finds structure in the data, such as grouping or clustering of data points. Cluster analysis is the assignment of a set of observations into subsets, referred to as clusters, so that observations within the same cluster are similar according to one or more predesignated criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness, or the similarity between members of the same cluster, and separation, the difference between clusters. Dynamic clustering as a form of unsupervised online/incremental machine learning considers two concepts: (1) incrementality of the learning methods to devise the clustering model and (2) self-adaptation of the learned model (parameters and structure). [0086] The features, structures, or characteristics of the disclosure described throughout this specification may be combined in any suitable manner in one or more embodiments. For example, the usage of “one embodiment,” “some embodiments,” “certain embodiment,” “certain embodiments,” or other similar language, throughout this specification refers to the fact that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present disclosure. 
Thus, appearances of the phrases “one embodiment,” “some embodiments,” “a certain embodiment,” “certain embodiments,” or other similar language, throughout this specification do not necessarily all refer to the same group of embodiments, and the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. [0087] One having ordinary skill in the art will readily understand that the embodiments as discussed above may be practiced with steps in a different order, and/or with elements in configurations that are different than those which are disclosed. Therefore, although this disclosure considers the outlined embodiments, it would be apparent to those of skill in the art that certain modifications, variations, and alternative constructions would be apparent, while remaining within the spirit and scope of this disclosure. In order to determine the metes and bounds of the disclosure, therefore, reference should be made to the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A method of generating a demand model for a potential hotel customer of a hotel room, the method comprising: based on features of the potential hotel customer, forming a plurality of clusters, each cluster comprising a corresponding weight and cluster probabilities; generating an initial estimated mixture of multinomial logit (MNL) models corresponding to each of the plurality of clusters, the mixture of MNL models comprising a weighted likelihood function based on the features and the weights; determining revised cluster probabilities and updating the weights; estimating an updated estimated MNL models and maximizing the weighted likelihood function based on the revised cluster probabilities and updated weights; and based on the update weights and updated estimated mixture of MNL models, generating the demand model that is adapted to predict a choice probability of room categories and rate code combinations for the potential hotel customer.
2. The method of claim 1, wherein the features of the potential hotel customer are known when the potential customer requests the hotel room.
3. The method of claim 1, the generating the estimated mixture of MNL models is based on offered prices, room category and rate plan position in the offer, and room and rate features.
4. The method of claim 1, wherein the forming the plurality of clusters comprises unsupervised machine learning.
5. The method of claim 4, wherein the unsupervised machine learning comprises one of dynamic clustering or soft clustering using Gaussian mixture models.
6. The method of claim 1, wherein the determining and estimating are repeated until a convergence criteria is reached, and the demand model is generated after the convergence criteria is reached.
7. The method of claim 1, wherein the features comprise at least one of: an arrival date and time, a number in a party, a booking channel or a booking window.
8. The method of claim 1, wherein the demand model is adapted to maximize revenue for the hotel room.
9. A computer readable medium having instructions stored thereon that, when executed by one or more processors, cause the processors to generate a demand model for a potential hotel customer of a hotel room, the generating the demand model comprising: based on features of the potential hotel customer, forming a plurality of clusters, each cluster comprising a corresponding weight and cluster probabilities; generating an initial estimated mixture of multinomial logit (MNL) models corresponding to each of the plurality of clusters, the mixture of MNL models comprising a weighted likelihood function based on the features and the weights; determining revised cluster probabilities and updating the weights; estimating an updated estimated mixture of MNL models and maximizing the weighted likelihood function based on the revised cluster probabilities and updated weights; and based on the update weights and updated estimated mixture of MNL models, generating the demand model that is adapted to predict a choice probability of room categories and rate code combinations for the potential hotel customer.
10. The computer readable medium of claim 9, wherein the features of the potential hotel customer are known when the potential customer requests the hotel room.
11. The computer readable medium of claim 9, the generating the estimated mixture of MNL models is based on offered prices, room category and rate plan position in the offer, and room and rate features.
12. The computer readable medium of claim 9, wherein the forming the plurality of clusters comprises unsupervised machine learning.
13. The computer readable medium of claim 12, wherein the unsupervised machine learning comprises one of dynamic clustering or soft clustering using Gaussian mixture models.
14. The computer readable medium of claim 9, wherein the determining and estimating are repeated until a convergence criteria is reached, and the demand model is generated after the convergence criteria is reached.
15. The computer readable medium of claim 9, wherein the features comprise at least one of: an arrival date and time, a number in a party, a booking channel or a booking window.
16. The computer readable medium of claim 9, wherein the demand model is adapted to maximize revenue for the hotel room.
17. A hotel reservation system that generates a demand model for a potential hotel customer of a hotel room comprising: one or more processors coupled to stored instructions; and a database storing historical booking data; the processors configured to: based on features of the potential hotel customer, forming a plurality of clusters, each cluster comprising a corresponding weight and cluster probabilities; generating an initial estimated mixture of multinomial logit (MNL) models corresponding to each of the plurality of clusters, the mixture of MNL models comprising a weighted likelihood function based on the features and the weights; determining revised cluster probabilities and updating the weights; estimating an updated estimated mixture of MNL models and maximizing the weighted likelihood function based on the revised cluster probabilities and updated weights; and based on the update weights and updated estimated mixture of MNL models, generating the demand model that is adapted to predict a choice probability of room categories and rate code combinations for the potential hotel customer.
18. The hotel reservation system of claim 17, wherein the features of the potential hotel customer are known when the potential customer requests the hotel room.
19. The hotel reservation system of claim 17, the generating the estimated MNL models is based on offered prices, room category and rate plan position in the offer, and room and rate features.
20. The hotel reservation system of claim 17, wherein the forming the plurality of clusters comprises unsupervised machine learning.
PCT/US2022/072854 2021-06-28 2022-06-09 Artificial intelligence based hotel demand model Ceased WO2023278935A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202280043130.6A CN117501286A (en) 2021-06-28 2022-06-09 Hotel demand model based on artificial intelligence
CA3222594A CA3222594A1 (en) 2021-06-28 2022-06-09 Artificial intelligence based hotel demand model
JP2023577727A JP2024523377A (en) 2021-06-28 2022-06-09 Artificial Intelligence-Based Hotel Demand Model

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163215688P 2021-06-28 2021-06-28
US63/215,688 2021-06-28
US17/399,342 US12014310B2 (en) 2021-06-28 2021-08-11 Artificial intelligence based hotel demand model
US17/399,342 2021-08-11

Publications (1)

Publication Number Publication Date
WO2023278935A1 (en) 2023-01-05

Family

ID=82361178

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/072854 Ceased WO2023278935A1 (en) 2021-06-28 2022-06-09 Artificial intelligence based hotel demand model

Country Status (3)

Country Link
JP (1) JP2024523377A (en)
CA (1) CA3222594A1 (en)
WO (1) WO2023278935A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070130201A1 (en) * 2005-12-05 2007-06-07 Sabre Inc. System, method, and computer program product for synchronizing price information among various sources of price information
US20210117998A1 (en) * 2019-10-21 2021-04-22 Oracle International Corporation Artificial Intelligence Based Room Personalized Demand Model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ELAHEH FATA ET AL: "Multi-stage and Multi-customer Assortment Optimization with Inventory Constraints", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 26 July 2020 (2020-07-26), XP081703603 *

Also Published As

Publication number Publication date
JP2024523377A (en) 2024-06-28
CA3222594A1 (en) 2023-01-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22736461

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 3222594

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2023577727

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 202280043130.6

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 202347088991

Country of ref document: IN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22736461

Country of ref document: EP

Kind code of ref document: A1