US20140280065A1 - Systems and methods for predictive query implementation and usage in a multi-tenant database system - Google Patents
- Publication number
- US20140280065A1 (application US 14/014,204)
- Authority
- US
- United States
- Prior art keywords
- data
- veritable
- user
- predictive
- gui
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G06F17/30424—
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/244—Grouping and aggregation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/2445—Data retrieval commands; View definitions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24554—Unary operations; Data partitioning operations
- G06F16/24556—Aggregation; Duplicate elimination
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24558—Binary matching operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04842—Selection of displayed objects or displayed text elements
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04847—Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/177—Editing, e.g. inserting or deleting of tables; using ruled lines
- G06F40/18—Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
Definitions
- Embodiments of the invention relate generally to the field of computing, and more particularly, to systems and methods for predictive query implementation and usage in a multi-tenant database system including means for implementing predictive population of null values with confidence scoring, means for predictive scoring and reporting of business opportunities with probability to close scoring, and other related embodiments.
- FIG. 1 depicts an exemplary architecture in accordance with described embodiments
- FIG. 2 illustrates a block diagram of an example of an environment in which an on-demand database service might be used
- FIG. 3 illustrates a block diagram of an embodiment of elements of FIG. 2 and various possible interconnections between these elements;
- FIG. 4 illustrates a diagrammatic representation of a machine in the exemplary form of a computer system, in accordance with one embodiment
- FIG. 5A depicts a tablet computing device and a hand-held smartphone each having a circuitry integrated therein as described in accordance with the embodiments;
- FIG. 5B is a block diagram of an embodiment of tablet computing device, a smart phone, or other mobile device in which touchscreen interface connectors are used;
- FIG. 6 depicts a simplified flow for probabilistic modeling
- FIG. 7 illustrates an exemplary landscape upon which a random walk may be performed
- FIG. 8 depicts an exemplary tabular dataset
- FIG. 9 depicts means for deriving motivation or causal relationships between observed data
- FIG. 10A depicts an exemplary cross-categorization in still further detail
- FIG. 10B depicts an assessment of convergence, showing inferred versus ground truth
- FIG. 11 depicts a chart and graph of the Bell number series
- FIG. 12A depicts an exemplary cross categorization of a small tabular dataset
- FIG. 12B depicts an exemplary architecture having implemented data upload, processing, and predictive query API exposure in accordance with described embodiments
- FIG. 12C is a flow diagram illustrating a method for implementing data upload, processing, and predictive query API exposure in accordance with disclosed embodiments
- FIG. 12D depicts an exemplary architecture having implemented predictive query interface as a cloud service in accordance with described embodiments
- FIG. 12E is a flow diagram illustrating a method for implementing predictive query interface as a cloud service in accordance with disclosed embodiments
- FIG. 13A illustrates usage of the RELATED command term in accordance with the described embodiments
- FIG. 13B depicts an exemplary architecture in accordance with described embodiments
- FIG. 13C is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 14A illustrates usage of the GROUP command term in accordance with the described embodiments
- FIG. 14B depicts an exemplary architecture in accordance with described embodiments
- FIG. 14C is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 15A illustrates usage of the SIMILAR command term in accordance with the described embodiments
- FIG. 15B depicts an exemplary architecture in accordance with described embodiments
- FIG. 15C is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 16A illustrates usage of the PREDICT command term in accordance with the described embodiments
- FIG. 16B illustrates usage of the PREDICT command term in accordance with the described embodiments
- FIG. 16C illustrates usage of the PREDICT command term in accordance with the described embodiments
- FIG. 16D depicts an exemplary architecture in accordance with described embodiments
- FIG. 16E is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 16F depicts an exemplary architecture in accordance with described embodiments
- FIG. 16G is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 17A depicts a Graphical User Interface (GUI) to display and manipulate a tabular dataset having missing values by exploiting a PREDICT command term;
- FIG. 17B depicts another view of the Graphical User Interface
- FIG. 17C depicts another view of the Graphical User Interface
- FIG. 17D depicts an exemplary architecture in accordance with described embodiments.
- FIG. 17E is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 18 depicts feature moves and entity moves within indices generated from analysis of tabular datasets
- FIG. 19A depicts a specialized GUI to query using historical dates
- FIG. 19B depicts an additional view of a specialized GUI to query using historical dates
- FIG. 19C depicts another view of a specialized GUI to configure predictive queries
- FIG. 19D depicts an exemplary architecture in accordance with described embodiments.
- FIG. 19E is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 20A depicts a pipeline change report in accordance with described embodiments
- FIG. 20B depicts a waterfall chart using predictive data in accordance with described embodiments
- FIG. 20C depicts an interface with defaults after adding a first historical field
- FIG. 20D depicts in additional detail an interface with defaults for an added custom filter
- FIG. 20E depicts another interface with defaults for an added custom filter
- FIG. 20F depicts an exemplary architecture in accordance with described embodiments.
- FIG. 20G is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 21A provides a chart depicting prediction completeness versus accuracy
- FIG. 21B provides a chart depicting an opportunity confidence breakdown
- FIG. 21C provides a chart depicting an opportunity win prediction
- FIG. 22A provides a chart depicting predictive relationships for opportunity scoring
- FIG. 22B provides another chart depicting predictive relationships for opportunity scoring.
- FIG. 22C provides another chart depicting predictive relationships for opportunity scoring.
- Client organizations with datasets in their databases can benefit from predictive analysis. Unfortunately, there is no low cost and scalable solution in the marketplace today. Instead, client organizations must hire technical experts to develop customized mathematical constructs and predictive models which are very expensive. Consequently, client organizations without vast financial means are simply priced out of the market and thus do not have access to predictive analysis capabilities for their datasets.
- Veritable offers a predictive database and additional commands and verbs so that a non-expert user can query the predictive database with inquiries such as: “predict revenue from users where age is greater than 35.”
- Because Veritable provides a predictive database that is not anchored to any particular underlying dataset, it remains useful as data and data structures change over time. For instance, data analysis performed by the Veritable core may simply be re-applied to a changed dataset. There is no need to re-hire experts or re-tool the models.
- In the case of salesforce.com specifically, the company offers cloud services to clients, organizations, and end users; behind those cloud services is a multi-tenant database system which permits users to have customized data, customized field types, and so forth.
- The underlying data and data structures are customized by the client organization for its own particular needs. Veritable may nevertheless be utilized on these varying datasets and data structures because it is not anchored to a particular underlying database scheme, structure, or content.
- the cloud service provider may elect to provide the capability as part of an overall service offering at no additional cost, or may elect to provide the additional capabilities for an additional service fee.
- Because the Veritable capabilities are systematically integrated into the cloud service's computing architecture and do not require experts to custom tailor a solution to each particular client organization's dataset and structure, the scalability brings massive cost savings. This enables our exemplary small company with limited financial resources to go from a 0% capability, because it cannot afford to hire technical experts from KXEN, to, for instance, a 95% accuracy capability using Veritable.
- embodiments further include various operations which are described below.
- the operations described in accordance with such embodiments may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the operations.
- the operations may be performed by a combination of hardware and software.
- Embodiments also relate to an apparatus for performing the operations disclosed herein.
- This apparatus may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or reconfigured by a computer program stored in the computer.
- a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
- Embodiments may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the disclosed embodiments.
- a machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer).
- a machine-readable (e.g., computer-readable) medium includes a machine (e.g., a computer) readable storage medium (e.g., read only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, flash memory devices, etc.), a machine (e.g., computer) readable transmission medium (electrical, optical, acoustical), etc.
- any of the disclosed embodiments may be used alone or together with one another in any combination.
- Although various embodiments may have been partially motivated by deficiencies with conventional techniques and approaches, some of which are described or alluded to within the specification, the embodiments need not necessarily address or solve any of these deficiencies, but rather, may address only some of the deficiencies, address none of the deficiencies, or be directed toward different deficiencies and problems which are not directly discussed.
- means for predictive query implementation and usage in a multi-tenant database system execute at an application in a computing device, a computing system, or a computing architecture, in which the application is enabled to communicate with a remote computing device over a public Internet, such as remote clients, thus establishing a cloud based computing service in which the clients utilize the functionality of the remote application which implements the predictive query and usage capabilities.
- Model-based clustering techniques, including inference in Dirichlet process mixture models, have difficulty when different dimensions are best explained by very different clusterings.
- Methods based on Markov chain Monte Carlo (MCMC) inference in a novel nonparametric Bayesian model automatically discover the number of independent nonparametric Bayesian models needed to explain the data, using a separate Dirichlet process mixture model for each group in an inferred partition of the dimensions.
- The disclosed model is exchangeable over both the rows of a heterogeneous data array (the samples) and the columns (new dimensions), and can model any dataset as the number of samples and dimensions both go to infinity. Efficiency and robustness are improved through use of algorithms described herein, which in certain instances require no preprocessing to identify veridical causal structure in provided raw datasets.
- Clustering techniques are widely used in data analysis for problems of segmentation in industry, exploratory analysis in science, and as a preprocessing step to improve performance of further processing in distributed computing and in data compression.
- As datasets grow larger and noisier, the assumption that a single clustering or distribution over clusterings can account for all the variability in the observations becomes less realistic, if not wholly infeasible.
- a robust clustering method should be able to ignore an infinite number of uniformly random or perfectly deterministic measurements.
- the assumption that a single nonparametric model must explain all the dimensions is partly responsible for the accuracy issues Dirichlet process mixtures often encounter in high dimensional settings.
- DP mixture based classifiers via class conditional density estimation highlight the problem. For instance, while a discriminative classifier can assign low weight to noisy or deterministic and therefore irrelevant dimensions, a generative model must explain them. If there are enough irrelevancies, it ignores the dimensions relevant to classification in the process. Combined with slow MCMC convergence, these difficulties have inhibited the use of nonparametric Bayesian methods in many applications.
- cross-categorization is utilized, which is an unsupervised learning technique for clustering based on MCMC inference in a novel nested nonparametric Bayesian model.
- This model can be viewed as a Dirichlet process mixture, over the dimensions or columns, of Dirichlet process mixture models over sampled data points or rows.
- our model reduces to an independent product of DP mixtures, but the partition of the dimensions, and therefore the number and domain of independent nonparametric Bayesian models, is also inferred from the data.
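The nesting described above can be illustrated with a minimal Python sketch. It assumes the Chinese restaurant process (CRP) as the sampling form of a Dirichlet process partition, applied first over columns and then, independently for each resulting column group, over rows. The function names, the concentration parameters `alpha_cols` and `alpha_rows`, and the omission of any per-category data likelihood are illustrative assumptions, not the disclosed implementation.

```python
import random


def crp_partition(n_items, alpha, rng):
    """Sample a partition of n_items from a Chinese restaurant process.

    Returns a list of group labels; labels are contiguous and start at 0.
    """
    labels = []
    counts = []  # counts[k] = number of items already placed in group k
    for _ in range(n_items):
        # Existing groups are weighted by their size, a new group by alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new group
        else:
            counts[k] += 1
        labels.append(k)
    return labels


def sample_cross_categorization(n_rows, n_cols, alpha_cols=1.0,
                                alpha_rows=1.0, seed=0):
    """Sample a column partition and one row partition per column group."""
    rng = random.Random(seed)
    # Outer partition: which columns (dimensions) are grouped together.
    column_groups = crp_partition(n_cols, alpha_cols, rng)
    n_groups = max(column_groups) + 1
    # Inner partitions: each column group clusters the rows independently.
    row_categories = [crp_partition(n_rows, alpha_rows, rng)
                      for _ in range(n_groups)]
    return column_groups, row_categories


column_groups, row_categories = sample_cross_categorization(n_rows=6, n_cols=4)
```

Each entry of `row_categories` is a separate clustering of the same rows, which is how different column groups can cut the data in different ways.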
- Standard feature selection results in the case where the partition of dimensions has only two groups.
- the described model utilizes MCMC because both model selection and deterministic approximations seem intractable due to the combinatorial explosion of latent variables, with changing numbers of latent variables as the partition of the dimensions changes.
- a generative process viewed as a model for heterogeneous data arrays with N rows, D columns of fixed type and values missing at random, can be described as follows:
- the model encodes a very different inductive bias than the IBP, discovering independent systems of categories over heterogeneous data vectors, as opposed to features that are typically additively combined. It is also instructive to contrast the asymptotic capacity of our model with that of a Dirichlet process mixture.
- the DP mixture has arbitrarily large asymptotic capacity as the number of samples goes to infinity. Put differently, it can model any distribution over finite dimensional vectors given enough data. However, if the number of dimensions (or features) is taken to infinity, it is no longer asymptotically consistent: if we generate a sequence of datasets by sampling the first $K_1$ dimensions from a mixture and then append $K_2 \gg K_1$ dimensions that are constant valued (e.g.
- the model has asymptotic capacity both in terms of the number of samples and the number of dimensions, and is infinitely exchangeable with respect to both quantities.
- the algorithm builds upon a general-purpose MCMC algorithm for probabilistic programs and specializes three of the kernels. It scales linearly per iteration in the number of rows and columns and includes inference over all hyperparameters.
- FIG. 10B depicts an assessment of convergence, showing inferred versus ground truth joint score for ~1000 MCMC runs (200 iterations each) with varying dataset sizes (up to 512 by 512, requiring ~1-10 minutes each) and true dimension groups.
- a strong majority of points fall near the ground truth dashed line, indicating reasonable convergence; perfect linearity is not expected, partly due to posterior uncertainty.
- a co-assignment matrix for dimensions where:
- An example may include data for 4273 hospitals by 74 variables, including quality scores and various spending measurements.
- The data is analyzed (~1 hour for convergence) with no preprocessing or missing data imputation.
- Each box contains one consensus dimension group and the number of categories according to that group.
- No causal dependence between quality of care, hospital capacity, and spending is found, though each kind of measurement results in a different clustering of the hospitals. Also recovered is the cost structure of modern hospitals (e.g. increased long term care causes increased ambulance costs, likely due to an increase in at-home mishaps). Standard clustering methods miss most of this type of cross-cutting structure.
- Veritable and associated Veritable APIs make use of a predictive database that finds the causes behind data and uses these causes to predict and explain the future in a highly automated fashion heretofore unavailable, thus allowing any developer to carry out scientific inquiries against a dataset without requiring custom programming and consultation with mathematicians and other such experts.
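A co-assignment matrix for dimensions, as mentioned above, can be sketched in Python under one common convention: entry (i, j) is the fraction of posterior samples in which dimensions i and j fall in the same group. That convention, the function name, and the toy samples are assumptions made for illustration; the text here does not spell out the definition it uses.

```python
def co_assignment_matrix(samples):
    """Fraction of samples in which each pair of dimensions shares a group.

    samples: list of label lists, one per MCMC sample; samples[s][d] is the
    group label of dimension d in sample s.
    """
    n_samples = len(samples)
    n_dims = len(samples[0])
    counts = [[0] * n_dims for _ in range(n_dims)]
    for labels in samples:
        for i in range(n_dims):
            for j in range(n_dims):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalize counts into frequencies between 0 and 1.
    return [[c / n_samples for c in row] for row in counts]


# Two hypothetical samples over three dimensions.
matrix = co_assignment_matrix([[0, 0, 1], [0, 1, 1]])
```

Values near 1 suggest that two dimensions are consistently grouped together across samples, and values near 0 suggest that they are consistently kept apart.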
- Veritable works by searching through the massive hypothesis space of all possible relationships present in a dataset, using an advanced Bayesian machine learning algorithm.
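The size of such a hypothesis space can be illustrated with the Bell number series referenced in FIG. 11: the Bell number B_n counts the ways to partition a set of n items into non-empty groups. The Python sketch below computes the series with the Bell triangle. It assumes that partitions of columns are one component of the space being searched, and the function name is illustrative.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle."""
    bells = [1]  # B_0 = 1: the empty set has exactly one partition
    row = [1]
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Each further entry adds the entry above-left in the previous row.
        for value in row:
            new_row.append(new_row[-1] + value)
        bells.append(new_row[0])
        row = new_row
    return bells


series = bell_numbers(10)
```

Here `series[10]` is 115975, so ten columns already admit more than a hundred thousand distinct groupings, which suggests why exhaustive enumeration is impractical.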
- the described Veritable technologies offer developers: state of the art inference performance and predictive accuracy on a very wide range of real-world datasets, with no manual parameter tuning whatsoever; scalability to very large datasets, including very high-dimensional data with hundreds of thousands or millions of columns; completely flexible predictions (e.g., predict the value of any subset of columns, given values for any other subset) without any retraining or adjustment; and quantification of the uncertainty associated with its predictions, since the system is built around a fully Bayesian probability model.
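The idea of predicting any column from any other subset, with an attached measure of uncertainty, can be sketched in Python as follows. This sketch substitutes plain conditional frequencies over matching rows for the Bayesian model described in the text, so it only illustrates the shape of the interface. The function name, column names, and data are hypothetical.

```python
from collections import Counter


def predict_distribution(rows, target, given):
    """Estimate P(target | given) from the rows that match the evidence.

    Returns a dict mapping each observed target value to its relative
    frequency, or an empty dict when no row matches.
    """
    matching = [
        row for row in rows
        if all(row.get(col) == val for col, val in given.items())
        and row.get(target) is not None
    ]
    counts = Counter(row[target] for row in matching)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}


# Hypothetical opportunity records.
rows = [
    {"region": "west", "size": "small", "won": "yes"},
    {"region": "west", "size": "large", "won": "no"},
    {"region": "east", "size": "small", "won": "yes"},
    {"region": "west", "size": "small", "won": "no"},
]

# The same data answers queries in either direction without retraining.
won_given_region = predict_distribution(rows, "won", {"region": "west"})
region_given_won = predict_distribution(rows, "region", {"won": "yes"})
```

The returned probabilities play the role of a confidence measure: a distribution concentrated on one value indicates a more certain prediction than one spread evenly across values.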
- a system is needed that can make sense of data as it exists in real businesses and does not require a pristine dataset or data that conforms to typical platonic ideals of what data should look like.
- a system is needed that can be queried for many different questions about many different variables, in real time.
- a system is needed which is capable of getting at the hidden structures in such data, that is, which variables matter and what are the segments or groups within the data.
- the system must be trustworthy, that is, it can't lie to the users by providing erroneous relationships and predictions.
- Such a system shouldn't reveal things that aren't true and shouldn't report ghost patterns that may exist in a first dataset but won't hold up overall.
- Such desirable characteristics are exceedingly difficult to attain with customized statistical analysis and customized predictive modeling, and wholly unheard of in automated systems.
- the resulting database appears to its users much like a traditional database. But instead of selecting columns from existing rows, users may issue predictive query requests via a structured query language. Such a structured language, rather than SQL, may be referred to as Predictive Query Language (“PreQL”). PreQL is not to be confused with PQL, which is short for the “Program Query Language.”
- PreQL is thus used to issue queries against the database to predict values.
- Such a PreQL query offers the same flexibility as SQL-style queries.
- users may issue PreQL queries seeking notions of similarity that are hidden or latent in the overall data without advance knowledge of what those similarities may be.
- Veritable utilizes a specially customized probabilistic model based upon foundational CrossCat modeling.
- CrossCat is a good start but could nevertheless be improved.
- prior models matched data with the model to understand hidden structure, like building a probabilistic index, but were so complex that their users literally required an understanding of advanced mathematics and probability theory simply to implement the models for any given dataset, rendering mere mortals incapable of realistically using such models.
- Veritable implementations described herein provide a service which includes distributed processing, job scheduling, persistence, check-pointing, and a user-friendly API or front-end interface which accepts users' questions and queries via the PreQL query structure.
- Other specialized front end GUIs and interfaces are additionally described to solve for particular use cases on behalf of users and provide other simple interfaces to complex problems of probability.
- Probabilities are assigned relative to knowledge or information context. Different observers can have different knowledge, and assign different probabilities to the same event, or assign different probabilities even when both observers have the same knowledge. Probability, as used herein, is a number between “0” (zero) and “1” (one), in which 0 means the event is sure to not occur on one extreme of a continuum and where 1 means the event is sure to occur on the other extreme of the same continuum. Both extremes are uninteresting because they represent a complete absence of uncertainty.
- a probability ties belief to one event.
- a probability distribution ties beliefs to every possible event, or at least, every event we want to consider. Choosing the outcome space is an important modeling decision. The probabilities summed over all outcomes in the space must total “1,” that is to say, one of the outcomes must occur.
- Probability distributions are convenient mathematical forms that help summarize the system's beliefs in the various probabilities, but choosing a standard distribution is a modeling choice in which all models are wrong, but some are useful.
- One example is the Poisson distribution, which is a good model when some event can occur 0 or more times in a span of time.
- the outcome space is the number of times the event occurs.
- the Poisson distribution has a single parameter, which is the rate, that is, the average number of times the event occurs. Its mathematical form has some nice properties: it is defined for all the non-negative integers, and it sums to 1.
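- As an illustration only, the two properties noted above can be checked numerically. The sketch below uses the standard Poisson form, P(k) = rate^k · e^(−rate) / k!, and is not taken from the described embodiments; the function name `poisson_pmf` and the rate value are assumptions chosen for the example.

```python
import math

def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events when the average count is `rate` (rate > 0)."""
    if k < 0:
        return 0.0  # the outcome space is the non-negative integers only
    # Work in log space so that large k does not overflow the factorial.
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))

rate = 3.5  # illustrative value, not from the source
# Summing over enough outcomes should give a total very close to 1.
total = sum(poisson_pmf(k, rate) for k in range(200))
print(total)
```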
- Veritable utilizes distributions which move beyond the standard distributions with specially customized modeling thus allowing for a more complex outcome space and further allowing for more complex ways of assigning probabilities to outcomes.
- Depicted at slide 8B above is what's called a mixture distribution combining a bunch of simpler distributions to form a more complex one.
- a mixture of Gaussians to model any distribution may be employed, while still assigning probabilities to outcomes, yielding a more involved mathematical relationship.
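- A minimal sketch of such a mixture follows, assuming three-part components of (weight, mean, standard deviation) whose weights sum to 1. The component values and function names are hypothetical and serve only to show that the combined, more complex distribution still assigns a total probability of 1.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a single Gaussian component."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    """Weighted sum of simple densities; components are (weight, mean, sd)."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)

# Illustrative components (assumed values); the weights sum to 1.
components = [(0.3, -2.0, 0.5), (0.7, 1.0, 1.0)]

# Riemann sum over [-10, 10] as a numerical check that the mixture integrates to about 1.
step = 0.01
area = sum(mixture_pdf(-10.0 + i * step, components) * step for i in range(2001))
print(area)
```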
- a Mondrian process defines a distribution on k-dimensional trees, providing means for dividing up a square or a cube.
- the outcome space is all possible trees and resulting divisions look like the famous painting.
- the outcome space is more structured than what is offered by the standard distributions.
- CrossCat does not use the Mondrian process, but it does use a structured outcome space. Veritable utilizes the Mondrian process in select embodiments.
- probability theory is a generalization of logic. Just like computers can use logic to reason deductively, probability lets computers reason inductively, generalize, categorize, etc. Probability gives us a way to combine different sources of information in a systematic manner, that is, utilizing automated computer implemented functionality, even when that information is vague, or uncertain, or ambiguous.
- FIG. 6 depicts a simplified flow for probabilistic modeling.
- Modeling is a series of choices and assumptions. For instance, it is possible to trade off fidelity and detail with tractability. Assumptions define an outcome space. Such an outcome space may be considered hypotheses, and in the modeling view, one of these possible hypotheses actually occurs. This is the hidden structure, and it is this hidden structure that generates the data. The hidden structure and the resulting generated data may be considered the generative view. For learning or inference perspectives sources of information about the hidden structure may include certain modeling assumptions (“prior”), as well as data observed (“likelihood”), from which a combination of prior and likelihood may be utilized to draw conclusions (“posterior”).
- prior: modeling assumptions
- likelihood: data observed
- Modeling makes assumptions and using the assumptions defines a hypothesis space. Probabilities are assigned to the hypotheses given data observed and then inference is used to figure out which of those explanatory hypotheses is the best or are plausible.
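- The combination of prior and likelihood into a posterior can be sketched for a small, discrete hypothesis space. The example below is an assumption-laden illustration (two hypothetical hypotheses about a coin, with made-up prior weights), not the model used by the described embodiments.

```python
from math import comb

def posterior(prior, likelihood):
    """Combine prior and likelihood over a discrete hypothesis space (Bayes' rule)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}

# Hypothetical hypotheses: the coin is fair, or it lands heads 80% of the time.
prior = {"fair": 0.9, "biased": 0.1}

# Observed data (illustrative): 8 heads in 10 flips.
heads, flips = 8, 10
likelihood = {
    "fair": comb(flips, heads) * 0.5 ** flips,
    "biased": comb(flips, heads) * 0.8 ** heads * 0.2 ** (flips - heads),
}

post = posterior(prior, likelihood)
print(post)  # belief in "biased" rises above its prior of 0.1
```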
- FIG. 7 illustrates an exemplary landscape upon which a random walk may be performed.
- each axis is one dimension in the hypothesis space.
- Real spaces can have very many dimensions. Height of surface is probability of the hidden variables, given data and modeling assumptions. Exploration starts by taking a random step somewhere, anywhere, and if the step is higher then it is kept, but if the step is lower, then it is kept sometimes and other times it is not, electing to stay put instead. The result is seemingly magic, in which it is guaranteed to explore the space in proportion to the true probability values. Over the long run two peaks result in this example whereas simple hill climbing will get caught. Such an approach thus explores the whole space.
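- The random walk described above resembles the well-known Metropolis algorithm. The following one-dimensional sketch is offered under that assumption; the two-peak target, step size, and seed are hypothetical, and the source does not state which sampler the embodiments actually use.

```python
import math
import random

def target(x):
    """Unnormalized height of the landscape: two peaks, at -2 and +2."""
    return math.exp(-0.5 * (x + 2.0) ** 2) + math.exp(-0.5 * (x - 2.0) ** 2)

def metropolis(n_steps, step_sd=1.0, start=-2.0, seed=0):
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_sd)  # take a random step
        # A higher step is always kept; a lower step is kept only sometimes,
        # with probability target(proposal) / target(x). Otherwise stay put.
        if rng.random() < target(proposal) / target(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(20000)
# Fraction of time spent near the right-hand peak; both peaks should be visited.
right = sum(1 for s in samples if s > 0) / len(samples)
print(right)
```

- A simple hill climber started at -2 would remain on the left-hand peak, whereas the walk above spends time on both peaks.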
- Other innovation includes added intelligence about jumps and exploring one or many dimensions at a time.
- tabular data is provided such that rows are individual entities and columns contain a certain piece of information about the respective entity (e.g., field). Different kinds of columns in one table are acceptable.
- FIG. 8 depicts an exemplary tabular dataset.
- rows are mammals and columns are variables that describe the mammals. Most are Boolean but some are categorical.
- FIG. 9 depicts means for deriving motivation or causal relationships between observed data.
- real-world data described in prior slides in which it is proffered that real-world data is not pristine. For instance, there may be data which simply doesn't matter: some columns may not matter, or certain columns may carry redundant information.
- a system is needed which utilizes a model that can understand the predictive relationships between all the columns. That is, some columns are predictively related and should get grouped together whereas others are not predictively related, and should be grouped separately. We call these groups of columns “views.” Within each view, the rows are grouped into categories.
- FIG. 10A depicts an exemplary cross-categorization in still further detail.
- a single cross-categorization is a particular way of slicing and dicing the table. First by column and then by row. It's a particular kind of process to yield a desired structured space. Utilizing concepts discussed above with respect to probability distributions, a probability is then assigned to each cross-categorization. More complex cross-categorizations yielding more views and more categories are less probable in and of themselves and typically are warranted only when the data really supports them.
- FIG. 11 depicts a chart and graph of the Bell number series.
- a series called the Bell numbers defines the number of partitions for n labeled objects which as can be seen from the above graph on the right, grows really, really fast. A handful of objects are exemplified in the chart on the left. Plotted on the right is a plot through 200 resulting in 1e+250 or a number with 250 zeros.
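- For illustration, the Bell numbers can be computed exactly with the Bell triangle, a standard recurrence. The function name `bell_numbers` is an assumption of this sketch; Python's arbitrary-precision integers make the rapid growth easy to inspect.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max], the number of partitions of n labeled objects."""
    bells = [1]  # B_0 = 1
    row = [1]    # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to the running value.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

print(bell_numbers(10))
# Number of decimal digits of B_200, to show how fast the series grows.
print(len(str(bell_numbers(200)[200])))
```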
- each column contains a single kind of data so each vertical strip within a category contains typed data such as numerical, categorical, etc.
- the basic standardized distributions may be utilized more effectively.
- each collection of points is modeled with a single simple distribution.
- Basic distributions are pre-selected which work well for each data type and each selected basic distribution is only responsible for explaining a small subset of the actual data, for which it is particularly useful.
- the basic distributions are combined such that a bunch of simple distributions are used to make a more complex one.
- the structure of the cross-categorization is used to chop up the data table into a bunch of pieces and each piece is modeled using the simple distribution selected of the data type, yielding a big mixture distribution of the data.
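- A minimal sketch of this chopping-up step follows. The tiny table, the column names, the data layout, and the smoothed Bernoulli rate used as the per-piece simple distribution are all assumptions made for illustration; they are not the data structures of the described embodiments.

```python
# Illustrative Boolean table: rows are mammals, columns are hypothetical variables.
table = {
    "zebra":       {"lives_in_water": False, "eats_fish": False, "has_hooves": True},
    "persian cat": {"lives_in_water": False, "eats_fish": True,  "has_hooves": False},
    "dolphin":     {"lives_in_water": True,  "eats_fish": True,  "has_hooves": False},
    "blue whale":  {"lives_in_water": True,  "eats_fish": False, "has_hooves": False},
}

# One cross-categorization: columns are split into views, and within each
# view the rows are split into categories.
cross_categorization = [
    {"columns": ["lives_in_water", "eats_fish"],
     "categories": [["zebra", "persian cat"], ["dolphin", "blue whale"]]},
    {"columns": ["has_hooves"],
     "categories": [["zebra"], ["persian cat", "dolphin", "blue whale"]]},
]

def pieces(table, cross_categorization):
    """Model each (column, category) piece with one simple distribution."""
    result = []
    for view in cross_categorization:
        for category in view["categories"]:
            for column in view["columns"]:
                values = [table[row][column] for row in category]
                # Simple per-piece model: Bernoulli rate with add-one smoothing.
                rate = (sum(values) + 1) / (len(values) + 2)
                result.append((column, tuple(category), rate))
    return result

all_pieces = pieces(table, cross_categorization)
for piece in all_pieces:
    print(piece)
```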
- FIG. 12A depicts an exemplary cross categorization of a small tabular dataset.
- the view on the right has the habitat and feeding style columns and the rows are divided into four categories: land mammals (Persian cat through Zebra), sea predators (dolphin through walrus), baleen whales (blue whale and humpback whale only), and the outlier amphibious beaver (e.g., both land and water living; we do not suggest that mammal beavers have gills).
- the view on the left has another division in which the primates are grouped together, large mammals are grouped, grazers are grouped, and then a couple of data oddities at the bottom (bat and seal).
- Results are thus not limited to a single cross-categorization. Instead, a collection of them is utilized and such a collection, when used together, tells us about the hidden structure of the data. For instance, if they're all the same, then there was no ambiguity in the data, but such a result doesn't occur with real-world data, despite being a theoretical possibility. Conversely, if they're all completely different, that means we couldn't find any structure in the data, which sometimes happens, and requires some additional post-processing to get at the uncertainty, such as feeding in additional noise. Typically, however, something in between occurs, and some interesting hidden structure is revealed from the data.
- the specially customized cross-categorization implementation represents the core of Veritable. This core is not directly exposed to the users who interface via APIs, PreQL, and specialized utility GUIs and interfaces, but such users nevertheless benefit from the functionality which drives these other capabilities.
- the Veritable core utilizes Monte Carlo methods for certain embodiments.
- FIG. 12B depicts an exemplary architecture having implemented data upload, processing, and predictive query API exposure in accordance with described embodiments.
- API calls are provided to upload data into the system.
- a row is the basic unit.
- An API call for “Analyze” kicks off a learning pass applying the specially customized CrossCat model for the uploaded data. It's also possible to specify an existing dataset, or to define a sub-set of data from a larger dataset.
- FIG. 12C is a flow diagram illustrating a method for implementing data upload, processing, and predictive query API exposure in accordance with disclosed embodiments.
- FIG. 12D depicts an exemplary architecture having implemented predictive query interface as a cloud service in accordance with described embodiments.
- FIG. 12E is a flow diagram illustrating a method for implementing predictive query interface as a cloud service in accordance with disclosed embodiments.
- FIG. 13A illustrates usage of the RELATED command term in accordance with the described embodiments.
- FIG. 13B depicts an exemplary architecture in accordance with described embodiments.
- FIG. 13C is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 14A illustrates usage of the GROUP command term in accordance with the described embodiments.
- FIG. 14B depicts an exemplary architecture in accordance with described embodiments.
- FIG. 14C is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 15A illustrates usage of the SIMILAR command term in accordance with the described embodiments.
- FIG. 15B depicts an exemplary architecture in accordance with described embodiments.
- FIG. 15C is a flow diagram illustrating a method in accordance with disclosed embodiments.
- Rows can be similar in one context but dissimilar in another. For instance, killer whales and blue whales are a lot alike in some respects, but very different in others.
- the input column disambiguates.
- FIG. 16A illustrates usage of the PREDICT command term in accordance with the described embodiments.
- Using PreQL, we can predict, that is, ask the system to render a prediction. With the cross-categorizations, a prediction request is treated as a new row and we assign that row to categories in each cross-categorization. Then, using the basic standardized distributions for each category, the values we want to predict are predicted. Unlike conventional predictive analytics, the system provides for flexible predictive queries, thus allowing the user of the PreQL query to specify as many or as few columns as they desire and thus allowing the system to predict as many or as few as the user wants.
- Veritable's core, via the APIs, can predict using a single target column or can predict using a few target columns at the user's discretion. For instance, a user can query the system asking: will an opportunity close AND at what amount? Such capabilities do not exist using conventional means.
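- The idea of treating a prediction request as a new row and assigning it to categories can be sketched as conditioning in a mixture. The example below is a simplified illustration for Boolean columns within a single view; the function name `predict`, the category weights, and the per-category probabilities are hypothetical, and a full implementation would combine results across every cross-categorization in the ensemble.

```python
def predict(categories, evidence, target):
    """Return P(target is True | evidence) for one view.

    categories: list of (weight, {column: P(column is True)}).
    evidence:   {column: bool} for the columns the caller has fixed.
    """
    numerator = 0.0
    denominator = 0.0
    for weight, params in categories:
        # How well does this category explain the fixed values of the new row?
        fit = weight
        for column, value in evidence.items():
            p = params[column]
            fit *= p if value else (1.0 - p)
        numerator += fit * params[target]
        denominator += fit
    return numerator / denominator

# Two illustrative categories (assumed values): land mammals and sea mammals.
categories = [
    (0.5, {"lives_in_water": 0.05, "eats_fish": 0.1}),
    (0.5, {"lives_in_water": 0.95, "eats_fish": 0.8}),
]

# Fixing nothing gives the overall rate; fixing a column shifts the prediction.
print(predict(categories, {}, "eats_fish"))
print(predict(categories, {"lives_in_water": True}, "eats_fish"))
```

- Because the caller chooses which columns go into `evidence` and which column is the `target`, the same stored model answers either query without retraining.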
- FIG. 16B illustrates usage of the PREDICT command term in accordance with the described embodiments.
- FIG. 16C illustrates usage of the PREDICT command term in accordance with the described embodiments.
- predict can be utilized to predict a row without fixing anything, thus asking Veritable to make up a row that isn't actually in the underlying source data, but could be nevertheless, resulting in what may be considered a synthetic row.
- Such a row will exhibit all of the structure and predictive relationships as in the real data.
- Such a capability may enable a user to test a dataset that's realistic, but not radioactive, without having to manually enter or guess at what such data may look like.
- FIG. 16D depicts an exemplary architecture in accordance with described embodiments.
- FIG. 16E is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 16F depicts an exemplary architecture in accordance with described embodiments.
- FIG. 16G is a flow diagram illustrating a method in accordance with disclosed embodiments.
- the user can take an incomplete row and predict all of the missing values to fill in the blanks.
- the user can begin a table with many missing values and render a table where all of the blanks were filled in.
- Specialized tools for this particular use case are discussed below in which functionality allows the user to trade off confidence for more or less data, such that more data (or all the data) can be populated with degrading confidence or only some data is populated, above a given confidence, and so forth.
- a specialized GUI is additionally provided and described for this particular use case. Such a GUI calls the predict query via PreQL via an API on behalf of the user, but fundamentally exercises Veritable's core.
- FIG. 17A depicts a Graphical User Interface (GUI) to display and manipulate a tabular dataset having missing values by exploiting a PREDICT command term.
- the table is provided as being 61% filled. No values are predicted, but the user may simply move the slider to increase the data fill for the missing values, causing the GUI's functionality to utilize the predict function on behalf of the user.
- FIG. 17B depicts another view of the Graphical User Interface.
- the table is provided as being 73% filled. Some but not all values are predicted. Not depicted here is the confidence threshold which is hidden from the user. Alternative interfaces allow the user to specify such a threshold.
- FIG. 17C depicts another view of the Graphical User Interface.
- the table is provided as being 100% filled. All values are predicted, but it may be necessary to degrade the confidence somewhat to attain 100% fill, though such a fill may nevertheless be feasible at acceptable levels of confidence.
- the grey scale values show the original data and the blue values depict the predicted values which do not actually exist in the underlying table.
- the chosen fill level selected by the user via the slider bar, can be “saved” to the original or a copy of the table, thus resulting in the predictive values provided being saved or input to the cell locations. Meta data can be used to recognize later that such values were predicted and not sourced.
- FIG. 17D depicts an exemplary architecture in accordance with described embodiments.
- FIG. 17E is a flow diagram illustrating a method in accordance with disclosed embodiments.
- GUIs and API tools include business opportunity scoring, next best offer, etc. These and others are described in additional detail below.
- the system can advise the user that it is 90% confident that the answer given is real, accurate, and correct, or the system may alternatively return a result indicating that it simply lacks sufficient data, and thus, there is not enough known to render a prediction.
- the Veritable core may be used to complete a data set as if it were real by filling in the missing but predicted information into a spreadsheet, from which the completed data may be used to draw conclusions, as is depicted in the above slides 33, 34, and 35.
- the control slider is feasible because when you complete income, for example, what is actually returned to the GUI functionality making the call is the respective person's income distribution.
- the user is given control over accuracy and confidence.
- the user can manipulate how much data is to be filled in and to what extent the confidence level applies.
- What Veritable does behind the scenes is to take a table of data with a bunch of typed columns, and then the depicted Perceptible GUI interface at slides 33, 34, and 35 asks for a prediction for every single cell.
- the Perceptible GUI gets a distribution in return from the API call to the Veritable core for the individual cell.
- functionality for the slider looks at the distributions and looks at their variances and then gives the estimates.
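- One way the slider behavior could work is sketched below. This is an assumption-based illustration, not the described GUI's code: each missing cell is given a hypothetical predicted distribution over categorical values, and a cell is filled only when its most probable value meets the confidence threshold (the source describes using variances, which would apply to numerical cells).

```python
def fill_missing(cells, predictions, min_confidence):
    """Fill None cells whose best predicted value meets the confidence threshold."""
    filled = dict(cells)
    for key, value in cells.items():
        if value is None:
            best_value, confidence = max(predictions[key].items(), key=lambda kv: kv[1])
            if confidence >= min_confidence:
                filled[key] = best_value
    return filled

def fill_fraction(cells):
    """Share of cells that hold a value."""
    return sum(v is not None for v in cells.values()) / len(cells)

# Illustrative table keyed by (row, column); None marks a missing value.
cells = {
    ("r1", "habitat"): "land", ("r1", "diet"): None,
    ("r2", "habitat"): None,   ("r2", "diet"): "fish",
    ("r3", "habitat"): "sea",
}
# Hypothetical per-cell distributions, as if returned by a predict call.
predictions = {
    ("r1", "diet"): {"plants": 0.95, "fish": 0.05},
    ("r2", "habitat"): {"sea": 0.6, "land": 0.4},
}

# Lowering the threshold fills more of the table at lower confidence.
for threshold in (1.0, 0.9, 0.5):
    print(threshold, fill_fraction(fill_missing(cells, predictions, threshold)))
```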
- for any given cell having a predicted result in place of the missing null value, the functionality, having seen “a” and “b” and “c,” then returns a value for that column.
- an API call is first made to upload or insert the data into the predictive database upon which Veritable operates and then an API call is made to the Veritable core to analyze the data.
- Upon insert, the data looks just like all other data. But once uploaded and the analyze operation is initiated, a probabilistic model is executed against the data. So the Veritable core starts to look at the ways that the rows and the columns can interact with each other and starts to build the various relationships and causations.
- a generated statistical index figures out how and which columns are related to one another. Veritable goes through and says, for instance, these particular columns are likely to share a causal origin.
- Veritable must perform this analysis using real world realities rather than pristine and perfect datasets.
- some columns are junk, some columns are duplicates, some columns are heterogeneous, some columns are noisy with only sparse data, and Veritable's core functionality implementing the statistical index must be able to pull the appropriate relationships and causations out despite having to perform its analysis operations against real-world data.
- Veritable does not just build up one statistical model but rather, Veritable builds multiple statistical indices as a distribution or an ensemble of statistical indices. Veritable performs its analysis by searching through a large space for all of the ways that the data provided can possibly interact.
- the distribution of indices results in a model that is stored and is queryable by PreQL structured queries via Veritable's APIs. What Veritable figures out via the analysis operations is first how the columns group together and then how the various rows group together. The analysis thus discovers the hidden structure in the data to provide a reduced representation of a table that explains how rows and columns may be related such that they can be queried via PreQL.
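- How an ensemble of indices might be queried for column relationships can be sketched as follows. Here each index is reduced to its grouping of columns into views, and the score for two columns is the fraction of indices that place them in the same view. The function name `related`, the column names, and the four-sample ensemble are hypothetical, and the scoring rule is an assumption rather than the documented behavior of the RELATED command.

```python
def related(ensemble, column):
    """Score every other column by how often it shares a view with `column`."""
    others = {c for views in ensemble for view in views for c in view} - {column}
    scores = {}
    for other in others:
        together = sum(
            1 for views in ensemble
            if any(column in view and other in view for view in views)
        )
        scores[other] = together / len(ensemble)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Illustrative ensemble: each entry is one cross-categorization's column views.
ensemble = [
    [{"habitat", "feeding"}, {"size"}],
    [{"habitat", "feeding"}, {"size"}],
    [{"habitat", "feeding", "size"}],
    [{"habitat"}, {"feeding", "size"}],
]

print(related(ensemble, "habitat"))
```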
- FIG. 18 depicts feature moves and entity moves within indices generated from analysis of tabular datasets.
- PreQL structured queries allow access to the queryable model and its ensemble of indices through specialized calls, including: “RELATED,” “SIMILAR,” “GROUP,” and “PREDICT,” each of which is introduced above at slides 28 through 32.
- Beginning with PREDICT, calling an appropriate API for the PREDICT functionality enables users to predict any chosen sub-set of data, that is, to predict any column or value. It is not required that an entire dataset be utilized to predict only a single value, as is typical with custom implemented models.
- the user uses PREDICT to provide or fix the value of any column and then the PREDICT API accepts the fixed values and the columns the user wants to predict.
- the functionality queries the Veritable core asking: “Given a row that has these values fixed, as provided by the user, then what would the distribution be?” For instance, the functionality could fix all but one column in the dataset and then predict the last one, as is done with customized models.
- the PREDICT functionality is far more flexible, and thus, the user can change the column to be predicted at a whim and custom implemented models simply lack this functionality as they lack the customized mathematical constructs to predict for such unforeseen columns or inquiries.
- the models simply cannot perform this kind of varying query, for instance, for a user exploring data making multiple distinct queries or simply changing the column or columns to be predicted as business needs and the underlying data and data structures of the client organization change over time.
- the user does not know all the columns to fix. For instance, perhaps the dataset knows a few things about one user but lots about another user. For instance, an ecommerce site may know little about a non-registered passerby user but knows lots of information about a registered user with a rich purchase history.
- the PREDICT functionality permits fixing or filling in only the stuff that is known without having to require all the data for all users, as some of the data is known to be missing. In such a way, the PREDICT functionality can still predict missing data elements with what is actually known.
- Another capability using the PREDICT functionality is to specify or fix all the data in a dataset that is known, that is, non-null, and then fill in everything else. In such a way, a user can say that what is known in the dataset is known and, although much data is understood to be missing, render predictions for the missing data nevertheless.
- the PREDICT operation would thus increase the population of predicted data for missing or null-values by accepting decreasing confidence, until all or a specified population percentage of data is reached, much like the Perceptible GUI and slider examples described above.
- Another functionality using PREDICT is to fill in an empty set. So maybe data is wholly missing, and then you start generating data that represents new rows and the new data in those rows represents plausible data, albeit synthetic data.
- PREDICT can be used to populate data elements that are not known but should be present or may be present, yet are not filled in within the data set, thus allowing the PREDICT functionality to populate such data elements.
- Another example is to use PREDICT to attain a certainty or uncertainty for any element and to display or return the range of plausible values for the element.
- Like the RELATED functionality, an API call to Veritable for SIMILAR accepts a row and then returns what other rows are most similar to the row specified. Like the RELATED examples, the SIMILAR functionality returns the probability that a row specified and any respective returned row actually exhibit similarity. For instance, rather than specifying a column, you specify “Fred” as a row in the dataset. Then you ask, using the SIMILAR functionality, for “Fred,” what rows are scored based on probability to be the most like “Fred.”
- the API call can return all rows scored from the dataset or return only rows above or below a specified threshold. For instance, perhaps rows above 0.8 are the most interesting or the rows below 0.2 are most interesting, or both, or a range.
- SIMILAR scores every row for the specified row and returns the rows and the score based on probability according to the user's constraints or the constraints of an implementing GUI, if any such constraints are given. Because the Veritable system figures out these relationships using its own analysis, there is more than one way to evaluate this inquiry. Thus, the user must provide to an API call for SIMILAR the specified row to find and additionally the COLUMN, which indicates how the user constructing the PreQL query or API call actually cares about the data. Thus, the API call requires both row and column to be fixed. In such a way, providing, specifying, or fixing the column variable provides disambiguation information to Veritable and the column indication tells the Veritable core where to enter the index. Otherwise there would be too many possible ways to score the returned rows as the Veritable core could not disambiguate how the caller cares about the information for which a similarity is sought.
- the GROUP functionality then implements a row centric operation like the SIMILAR functionality, but in contrast to an API call for SIMILAR where you must give a row and the SIMILAR call returns back a list of other rows and a score based on their probabilities of being related, with the GROUP functionality, the API call requires no row to be given or fixed whatsoever. Only a column is thus provided when making a call to the GROUP functionality.
- Calling the GROUP functionality with a specified or fixed column causes the functionality to return the groupings of the ROWS that seem to be related or correlated in some way based on Veritable's analysis.
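- The roles of the row and column arguments in SIMILAR and GROUP can be sketched as follows. The data layout, function names, and scoring rule (the fraction of cross-categorizations in which two rows share a category, within the view containing the given column) are assumptions for illustration, not the documented implementation.

```python
def view_for(sample, column):
    """The column tells us where to enter the index: pick the view containing it."""
    return next(view for view in sample if column in view["columns"])

def similar(ensemble, row, column):
    """Score other rows by how often they share a category with `row`."""
    rows = set().union(*ensemble[0][0]["categories"])
    scores = {}
    for other in rows - {row}:
        hits = 0
        for sample in ensemble:
            categories = view_for(sample, column)["categories"]
            if any(row in c and other in c for c in categories):
                hits += 1
        scores[other] = hits / len(ensemble)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

def group(sample, column):
    """Return the row groupings of the view containing `column`; no row is fixed."""
    return [sorted(c) for c in view_for(sample, column)["categories"]]

# Illustrative two-sample ensemble (assumed values).
everyone = {"killer whale", "blue whale", "dolphin"}
ensemble = [
    [{"columns": {"habitat"}, "categories": [everyone]},
     {"columns": {"feeding"}, "categories": [{"killer whale", "dolphin"}, {"blue whale"}]}],
    [{"columns": {"habitat"}, "categories": [everyone]},
     {"columns": {"feeding"}, "categories": [{"killer whale"}, {"dolphin"}, {"blue whale"}]}],
]

print(similar(ensemble, "killer whale", "habitat"))
print(similar(ensemble, "killer whale", "feeding"))
print(group(ensemble[0], "feeding"))
```

- In this toy ensemble the killer whale and the blue whale are alike with respect to habitat but not with respect to feeding, which is why the column is needed to disambiguate the request.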
- PreQL structured queries permit programmatic queries into the predictive database in a manner similar to a programmer making SQL queries into a relational database. Rather than a “select” statement in the query, the term is replaced with the “predict” or “similar” or “related” or “group” statements.
- the above query is implemented via a specialized GUI interface which accepts inputs from a user via the GUI interface and constructs, calls, and returns data via the PREDICT functionality on behalf of the user without requiring the user to actually write or even be aware of the underlying PreQL structured query made to the Veritable core.
- a specialized GUI implementation enables users to filter on a historical value by comparing a historical value versus a current value in a multi-tenant system. Filtering for historical data is performed using a GUI's field option, wherein the GUI displays current fields related to historical fields, as is depicted.
- FIG. 19A depicts a specialized GUI to query using historical dates.
- Embodiments provide for the ability to filter historical data by comparing historical value versus a constant in a multi-tenant system.
- the embodiments utilize the Veritable core by calling the appropriate APIs to make queries on behalf of the GUI users.
- the GUI performs the query and then consumes the data which is then presented back to the end users via the interface.
- a sales person looking at the sales information in a particular data set.
- the interface can take the distributions provided by Veritable's core and produce a visual indication for ranking the information according to a variety of customized solutions and use cases.
- systems and methods for determining the likelihood of an opportunity to close using only closed opportunities are provided.
- SalesCloud is an industry leading CRM application currently used by 125,000 enterprise customers. Customers see the value of storing the data in the Cloud. These customers appreciate a web based interface to view and act on their data, and these customers like to use report and dashboard mechanisms provided by the cloud based service. Presenting these various GUIs as tabs enables salespeople and other end users to explore their underlying dataset in a variety of ways to learn how their business is performing in real-time. These users also rely upon partners to extend the provided cloud based service capabilities through APIs.
- a cloud based service that offers customers the opportunity to learn from the past and draw data driven insights is highly desirable as such functionality should help these customers make intelligent decisions about the future for their business based on their existing dataset.
- the customized GUIs utilize Veritable's core to implement predictive models which may vary per customer organization or be tailored to a particular organization's needs via programmatic parameters and settings exposed to the customer organization to alter the configuration and operation of Veritable's functionality.
- a GUI may be provided to compute and assign an opportunity score based on probability for a given opportunity reflecting the likelihood of that opportunity to close as a win or loss.
- the data set to compute this score would consist of all the opportunities that have been closed (either won or lost) in a given period of time, such as 1, 2, or 3 years or the lifetime of an organization, etc. Additional data elements from the customer organization's dataset may also be utilized, such as the account object as an input.
- Machine learning techniques implemented via Veritable's core such as SVM, Regression, Decision Trees, PGM, etc., are then used to build an appropriate model to render the opportunity score and then the GUI depicts the information to the end user via the interface.
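- As a minimal sketch of scoring an open opportunity from closed opportunities only, the example below fits a small logistic regression by gradient descent. The feature names (deal amount, logged activities), the sample data, and the choice of logistic regression are assumptions made for illustration; the source does not specify which model Veritable's core applies.

```python
import math

# Hypothetical closed opportunities: (amount in $k, logged activities, won=1/lost=0)
closed = [
    (10, 8, 1), (20, 9, 1), (15, 7, 1), (30, 10, 1),
    (80, 2, 0), (90, 1, 0), (70, 3, 0), (60, 2, 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def features(amount_k, activities):
    # Bias term plus two features scaled to roughly [0, 1]
    return [1.0, amount_k / 100.0, activities / 10.0]

def fit(rows, epochs=2000, lr=0.5):
    """Fit logistic regression weights by batch gradient descent."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for amount_k, activities, won in rows:
            x = features(amount_k, activities)
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - won
            for j in range(3):
                grad[j] += err * x[j]
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def opportunity_score(w, amount_k, activities):
    """Estimated probability that an open opportunity closes as a win."""
    x = features(amount_k, activities)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

weights = fit(closed)
likely = opportunity_score(weights, 15, 8)    # resembles past wins
unlikely = opportunity_score(weights, 75, 2)  # resembles past losses
print(f"likely={likely:.2f} unlikely={unlikely:.2f}")
```

The score is a probability between 0 and 1, so a GUI could display it directly as a percentage.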
- Systems and methods for determining the likelihood of an opportunity to close using historical trending data are additionally disclosed. For instance, a historical selector for picking relative or absolute dates is described.
- FIG. 19B depicts an additional view of a specialized GUI to query using historical dates.
- Systems and methods for determining the likelihood for an opportunity to close given social and marketing data are additionally disclosed.
- the dataset of the customer organization or whoever is utilizing the system is expanded on behalf of the end user beyond that which is specified, and that additional information is then utilized to further influence and educate the predictive models. For instance, certain embodiments pull information from an exemplary website such as “data.com,” and then the data is associated with each opportunity in the original data set to discover further relationships, causations, and hidden structure which can then be presented to the end user.
- Other data sources are equally feasible, such as pulling data from social networking sites, search engines, data aggregation service providers, etc.
- social data is retrieved and a sentiment is provided to the end-user via the GUI to depict how the given product is viewed by others in a social context.
- a salesperson can look at the person's LinkedIn profile, and with information from data.com or other sources the salesperson can additionally be given sentiment analysis in terms of social context for the person that the salesperson is actually trying to sell to. For instance, has the target purchaser commented about other products, or have they complained about any other products, etc.
- Each of these data points and others may help influence the model employed by Veritable's core to render a prediction.
- datasets are explored beyond the boundaries of any particular customer organization having data within the multi-tenant database system.
- benchmark predictive scores are generated based on industry specific learning using cross-organizational data stored within the multi-tenant database system. For example, data mining may be performed against telecom specific customer datasets, given their authorization or license to do so. Such cross-organization data yields a much larger multi-tenant dataset, which can then be analyzed via Veritable to provide insights, relationships, causations, and additional hidden structure that may not be present within a single customer organization's dataset.
- the probability for that deal to close in 3 months may be, according to such analysis, 50% because past transactions have shown that it could take up to six months to close a $100k telecom deal in the NY-NJ-Virginia tri-city area when viewed in the context of multiple customer organizations' datasets.
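- The benchmark figure above can be illustrated with a simple empirical estimate: the fraction of comparable past deals, pooled across organizations, that closed within a given horizon. The durations below are hypothetical sample data and the function name is an assumption; the source does not describe the estimator Veritable uses.

```python
# Hypothetical months-to-close for comparable deals pooled across organizations
months_to_close = [1, 2, 3, 3, 4, 5, 6, 6]

def close_probability(durations, horizon_months):
    """Fraction of comparable past deals that closed within the horizon."""
    if not durations:
        raise ValueError("no comparable deals to benchmark against")
    closed_in_time = sum(1 for d in durations if d <= horizon_months)
    return closed_in_time / len(durations)

p3 = close_probability(months_to_close, 3)
p6 = close_probability(months_to_close, 6)
print(p3, p6)
```

With this sample, half of the comparable deals closed within three months and all of them closed within six.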
- Many of the insights realized through such a process may be non-intuitive, yet capable of realization through application of the techniques described herein.
- the vendor can utilize such information to gain a better understanding of the particular regional market based on the predictions and confidence levels given.
- Provided functionality can additionally predict information for a vertical sector as well as for the region.
- a relationship may be discovered that where customers bought a, those customers also bought b.
- These kinds of matching relationships are useful, but can be further enhanced. For instance, using the predictive analysis of Veritable it is additionally possible to identify the set of factors that led to a particular opportunity score (e.g., a visualized presentation of such analysis).
- FIG. 19C depicts another view of a specialized GUI to configure predictive queries.
- the GUI presents a 42% opportunity score at the user interface, but when the user mouses over the opportunity score, the GUI then displays sub-detailed elements that make up that opportunity score.
- the GUI makes the necessary Veritable based API calls on behalf of the user such that an appropriate call is made to the predictive platform to pull the opportunity score and display that information to the user as well as the sub-detail relationships and causations considered relevant.
- the GUI can additionally leverage the predict and analyze capabilities of Veritable which upon calling a predict function for a given opportunity will return data necessary to create a histogram for an opportunity. So not only can the user be given a score, but the user can additionally be given the factors and guidance on how to interpret the information provided and what to do with such information.
- a feedback loop is created through which further data is input into the predictive database upon which additional predictions and analysis are carried out in an adaptive manner. For example, as the Veritable core learns more about the data the underlying models may be refreshed on a monthly basis by re-performing the analysis of the dataset so as to re-calibrate the data using the new data obtained via the feedback loop.
- FIG. 19D depicts an exemplary architecture in accordance with described embodiments.
- FIG. 19E is a flow diagram illustrating a method in accordance with disclosed embodiments.
- FIG. 20A depicts a pipeline change report in accordance with described embodiments. For example, a user can request to be shown the open pipeline for the current month by stage.
- Each stage may consist of multiple opportunities and each might be able to be duplicated because each might change according to the amount or according to the stage, etc. Thus, if a user is looking at the last four weeks, then one opportunity may change from $500 to $1500 and thus be duplicated.
- the cloud computing architecture executes functionality which runs across all the data for all tenants.
- the database maintains a history object into which all of the audit data is retained such that a full and rich history can later be provided to the user at their request to show the state of any event in the past, without corrupting the current state of the data.
- a user may nevertheless utilize the system to display the state of a particular opportunity as it stood last week, or as it transitioned through the past quarter, and so forth.
- All of the audit data from history objects for various categories of data is then aggregated into a historical trending entity which is a custom object that stores any kind of data. This object is then queried by the different historical report types across multiple tenants to retrieve the necessary audit trail data such that any event at any time in the past can be re-created for the sake of reporting, predictive analysis, and exploration.
- the historical audit data may additionally be subjected to the analysis capabilities of Veritable by including it within a historical dataset for the sake of providing further predictive capabilities.
- the algorithms to provide historical reporting capabilities are applied across all the tenant data which is common within the historical trending data object and the interim opportunity history and lead history, etc.
- the data can also be visualized using salesforce.com's charting engine as depicted by the waterfall diagram.
- FIG. 20B depicts a waterfall chart using predictive data in accordance with described embodiments.
- waterfall charts for historical data in a multi-tenant system are thus provided. For instance, on the x axis is the weekly snapshot for all the opportunities being worked. The amounts are changing up and down and then are also grouped by stages.
- the waterfall enables a user to look at two points in time and see how opportunities changed between day one and day two.
- waterfall diagrams can be used to group all opportunities into different stages, as in the example above in which every opportunity is mapped according to its stage, allowing a user to look into the past and understand what the timing is for these opportunities to actually come through to closure.
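- A waterfall between two points in time can be sketched as the decomposition of the change in total pipeline into added, removed, and changed opportunities, with a separate grouping by stage. The snapshot layout, opportunity identifiers, and stage names below are hypothetical.

```python
# Hypothetical snapshots: opportunity id -> (stage, amount)
day_one = {
    "opp1": ("Prospecting", 500),
    "opp2": ("Quotation", 1000),
    "opp3": ("Negotiation", 700),
}
day_two = {
    "opp1": ("Prospecting", 1500),  # amount changed
    "opp2": ("Closed Won", 1000),   # stage changed
    "opp4": ("Prospecting", 300),   # newly added
}

def waterfall(before, after):
    """Break the change in total pipeline between two snapshots into steps."""
    start = sum(amount for _, amount in before.values())
    end = sum(amount for _, amount in after.values())
    added = sum(after[k][1] for k in after if k not in before)
    removed = -sum(before[k][1] for k in before if k not in after)
    changed = sum(after[k][1] - before[k][1] for k in before if k in after)
    return {"start": start, "added": added, "removed": removed,
            "changed": changed, "end": end}

def totals_by_stage(snapshot):
    """Sum opportunity amounts per stage for one snapshot."""
    totals = {}
    for stage, amount in snapshot.values():
        totals[stage] = totals.get(stage, 0) + amount
    return totals

steps = waterfall(day_one, day_two)
stages = totals_by_stage(day_two)
print(steps, stages)
```

The start value plus the three intermediate steps always equals the end value, which is the property a waterfall chart relies on.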
- Historical data and the audit history saved to the historical trending data object are enabled through snapshots and field history. Using the historical trending data object the desired data can then be queried.
- the historical trending data object may be implemented as one table with indexes on the table from which any of the desired data can then be retrieved.
- the software modules implementing the various GUIs and use cases populate the depicted table using the opportunity data retrieved from the historical trending data object's table.
- systems and methods to deliver waterfall charts for historical data in a multi-tenant system and historical trending in a multi-tenant environment are additionally described.
- time as a dimension is used to then provide a decision tree for the customers to pick either absolute date or a range of dates. Within the date customers can pick an absolute date, such as Jan. 1, 2013 or a relative date such as the first day of the current month or first day of the last month, etc.
- the user can take a step back in time, thinking back where they were a week ago, or a month ago and identify the opportunity by creating a range of dates and displaying what opportunities were created during those dates.
- a salesperson wanting such information may have had ten opportunities and on Feb. 1, 2013, the salesperson's target buyer expresses interest in a quote, so the stage changes from prospecting to quotation. Another target buyer, however, says they want to buy immediately, so the state changes from quotation to sale/charge/close.
- the functionality therefore provides a back end which implements a decision tree with the various dates that are created. The result is that the functionality can give the salesperson a view of all the opportunities that are closing in the month of January, or February, or within a given range, etc.
- the query for dates is unique because it is necessary to traverse the decision tree to get to the date the user picks and then enabling the user to additionally pick the number of snap shots, from which the finalized result set is determined, for instance, from Feb. 1 to Feb. 6, 2013.
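- The date selection described above can be sketched as a small resolver that accepts either an absolute date or a named relative date, followed by a helper that enumerates daily snapshots in a range. The selector names are hypothetical, and the simple branching here only stands in for the decision tree the description refers to.

```python
from datetime import date, timedelta

def resolve(selector, today):
    """Resolve an absolute date or a named relative date to a concrete date."""
    if isinstance(selector, date):
        return selector  # absolute date, e.g. date(2013, 1, 1)
    if selector == "first_day_of_current_month":
        return today.replace(day=1)
    if selector == "first_day_of_last_month":
        last_day_of_previous = today.replace(day=1) - timedelta(days=1)
        return last_day_of_previous.replace(day=1)
    raise ValueError(f"unknown selector: {selector!r}")

def snapshot_dates(start, end):
    """List one snapshot date per day from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]

today = date(2013, 2, 6)
start = resolve("first_day_of_current_month", today)
previous = resolve("first_day_of_last_month", today)
snaps = snapshot_dates(start, today)
print(start, previous, len(snaps))
```

Resolving relative selectors against a supplied `today` keeps the function deterministic and easy to test.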
- FIG. 20C depicts an interface with defaults after adding a first historical field.
- a historical selector a constant in a multi-tenant system
- the customer has the ability to filter on historical data using a custom historical filter.
- the interface provides the ability for the customer to look at all of the filters on the left that they can use to restrict a value or a field, thus allowing customers to filter on historical column data for any given value.
- a customer may look at all of the open opportunities for a given month or filter the data set according to current column data rather than historical.
- a user at the interface can fill out the amount, stage, close date, probability, forecast category, or other data elements and then as the salesperson speaks with the target buyer, the state is changed from prospecting to quoting, to negotiation based on the progress that is made with the target buyer, and eventually to a state of won/closed or lost, etc. So maybe the target buyer is trying to decrease the amount of the deal and the salesperson is trying to increase the amount. All of that data and those state changes (e.g., a change in amount can be a state change within a given phase of the deal) are stored in the historical opportunity object, which provides the audit trail.
- the GUI enables a user to go back 12 months according to one embodiment.
- Such historical data and audit data may be processed with a granularity of one day, and thus, a salesperson can go back in time and view how the data has changed over time within the data set using the daily granular reporting.
- the historical trending entity object is used to allow the tool to pull the information about how these opportunities changed over time for the salesperson.
- such metrics would be useful to other disciplines also; for example, a service manager running a call center who gets 100 cases from sales agents wants to know how to close those calls, etc.
- the campaign managers will want to know how to close the various leads and opportunities as well as peer back into history to see how events influenced the results of past opportunities.
- FIG. 20D depicts in additional detail an interface with defaults for an added custom filter.
- FIG. 20E depicts another interface with defaults for an added custom filter.
- FIG. 20F depicts an exemplary architecture in accordance with described embodiments.
- FIG. 20G is a flow diagram illustrating a method in accordance with disclosed embodiments.
- the historical trending entity object is implemented via a historical data schema in which history data is stored in a new table, core.historical_entity_data, as depicted at Table 1 below:
- Indices utilized in the above Table 1 include: organization_id, key_prefix, historic_entity_data_id.
- PK includes: organization_id, key_prefix, system_modstamp. Unique, find, and snapshot for given date and parent record: organization_id, key_prefix, parent_id, valid_to, valid_from. Indices organization_id, key_prefix, valid_to facilitate data clean up.
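- The snapshot lookup for a given date and parent record can be sketched with an in-memory SQLite table. Only the key column names (organization_id, key_prefix, parent_id, valid_from, valid_to) come from the description; the value columns, the sample rows, the `006` key prefix, the far-future `valid_to` sentinel, and the half-open validity interval are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# amount and stage are hypothetical value columns
conn.execute(
    "CREATE TABLE historical_entity_data ("
    " organization_id TEXT, key_prefix TEXT, parent_id TEXT,"
    " valid_from TEXT, valid_to TEXT, amount INTEGER, stage TEXT)"
)
# Index matching the snapshot lookup described above
conn.execute(
    "CREATE INDEX snapshot_idx ON historical_entity_data"
    " (organization_id, key_prefix, parent_id, valid_to, valid_from)"
)
rows = [
    ("org1", "006", "opp1", "2013-01-01", "2013-02-01", 500, "Prospecting"),
    ("org1", "006", "opp1", "2013-02-01", "9999-12-31", 1500, "Quotation"),
    ("org2", "006", "opp9", "2013-01-15", "9999-12-31", 900, "Prospecting"),
]
conn.executemany(
    "INSERT INTO historical_entity_data VALUES (?, ?, ?, ?, ?, ?, ?)", rows
)

def snapshot(org, prefix, parent, as_of):
    """Return (amount, stage) for one parent record as it stood on as_of."""
    cur = conn.execute(
        "SELECT amount, stage FROM historical_entity_data"
        " WHERE organization_id = ? AND key_prefix = ? AND parent_id = ?"
        " AND valid_from <= ? AND valid_to > ?",
        (org, prefix, parent, as_of, as_of),
    )
    return cur.fetchone()

jan = snapshot("org1", "006", "opp1", "2013-01-20")
feb = snapshot("org1", "006", "opp1", "2013-02-06")
print(jan, feb)
```

Storing dates as ISO 8601 strings lets plain string comparison order them correctly, and leading every predicate with organization_id keeps each query scoped to one tenant.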
- Such a table is additionally counted against storage requirements according to certain embodiments. Usage may be capped at 100. Alternatively, when available slots are running low, old slots may be cleaned. Historical data management, row limits, and statistics may be optionally utilized.
- Sampling of production data revealed that recent growth in row count for entity history is approximately 2.5 B (billion) rows/year. Since historical trending will store a single row for any number of changed fields, an additional factor of 0.78 can be applied. Since historical trending will only allow 4 standard and at most 5 custom objects, an additional factor of 0.87 can be used to only include the top standard and custom objects contributing to entity history. With an additional factor of 0.7 to only include UE and EE organizations, the expected row count for historical trending is 1.2 B rows/year in the worst case scenario.
- Historical data may be stored by default for 2 years and the size of the table is expected to stay around 2.4 B rows.
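- The sizing estimate above can be reproduced as a short calculation. All figures are taken from the description; the variable names are illustrative.

```python
base_rows_per_year = 2.5e9   # observed entity history growth
single_row_factor = 0.78     # one row stored for any number of changed fields
object_limit_factor = 0.87   # only 4 standard and at most 5 custom objects
edition_factor = 0.7         # only UE and EE organizations
retention_years = 2          # default retention period

expected_per_year = (base_rows_per_year * single_row_factor
                     * object_limit_factor * edition_factor)
steady_state_rows = expected_per_year * retention_years

print(f"{expected_per_year / 1e9:.2f} B rows/year")
print(f"{steady_state_rows / 1e9:.2f} B rows retained")
```

The product comes to roughly 1.19 B rows per year, which rounds to the stated 1.2 B, and two years of retention gives roughly 2.4 B rows.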
- Custom value columns are to be handled by custom indexes similar to custom objects.
- each organization will have a history row limit for each object. Such a limit could be between approximately 1 and 5 million rows per object which is sufficient to cover storage of current data as well as history data based on analyzed usage patterns of production data with only very few organizations (3-5) occasionally having so many objects that they would hit the configurable limit.
- the customized table can be custom indexed to help query performance.
- High level use cases for such historical based data in a dataset to be analyzed and subjected to Veritable's predictive analysis include: Propensity to Buy and Lead Scoring for sales representatives and marketing users. For instance, sales users often get leads from multiple sources (marketing, external, sales prospecting etc.) and often times, in any given quarter, they have more leads to follow up than they have time. Sales representatives often need guidance with key questions such as: which leads have the highest propensity to buy, what is the likelihood of a sale, what is the potential revenue impact if this lead is converted to an opportunity, what is the estimated sale cycle based on historical observations if this lead is converted to an opportunity, what is the lead score for each of their leads in their pipeline so that sales representatives can discover the high potential sales leads in their territories, and so forth.
- Sales representatives may seek to determine the top ten products each account will likely buy based on the predictive analysis and the deal sizes if they successfully close, the length of the deal cycle based on the historical trends of similar accounts, and so forth. When sales representatives act on these recommendations, they can broaden their pipeline and increase their chance to meet or exceed quota, thus improving sales productivity, business processes, prospecting, and lead qualification.
- Additional use cases for such historical based data may further include: likelihood to close/win and opportunity scoring. For instance, sales representatives and sales managers may benefit from such data as they often have many deals in their current pipeline and must juggle where to apply their time and attention in any month/quarter. As these sales professionals approach the end of the sales period, the pressure to meet their quota is of significant importance. Opportunity scoring can assist with ranking the opportunities in the pipeline based on the probability of such deals to close, thus improving the overall effectiveness of these sales professionals.
- Data sources may include such data as: Comments, sales activities logged, standard field numbers for activities (e.g., events, log a call, tasks etc.), C-level customer contacts, decision maker contacts, close dates, standard field numbers for times the close date has pushed, opportunity competitors, standard field opportunities, competitive assessments, executive sponsorship, standard field sales team versus custom field sales team as well as the members of the respective teams, chatter feed and social network data for the individuals involved, executive sponsor involved in a deal, DSRs (Deal Support Requests), and other custom fields.
- Historical based data can be useful to Veritable's predictive capabilities for generating metrics such as Next Likelihood Purchase (NLP) and opportunity whitespace for sales reps and sales managers.
- a sales rep or sales manager responsible for achieving quarterly sales targets will undoubtedly be interested in: which types of customers are buying which products; which prospects most resemble existing customers; are the right products being offered to the right customer at the right price; what more can we sell to my customer to increase the deal size, and so forth. Looking at historical data of things that similar customers have purchased to uncover selling trends, and using such metrics yields valuable insights to make predictions about what customers will buy next, thus improving sales productivity and business processes.
- Another capability provided to end users is to provide customer references on behalf of sales professionals and other interested parties.
- when sales professionals require customer references for potential new business leads, they often spend significant time searching through and piecing together such information from CRM sources such as custom applications, intranet sites, or reference data captured in their databases.
- Veritable core and associated use case GUIs can provide key information to these sales professionals.
- the application can provide data that is grouped according to industry, geography, size, similar product footprint, and so forth, as well as provide in one place what reference assets are available for those customer references, such as customer success stories, videos, best practices, which reference customers are available to chat with a potential buyer, customer reference information grouped according to the contact person's role, such as CIO, VP of sales, etc., which reference customers have been over utilized and thus may not be good candidate references at this time, who are the sales representatives or account representatives for those reference customers at the present time or at any time in the past, who is available internal to an organization to reach or make contact with the reference customer, and so forth.
- Veritable's analysis can identify such relationships and hidden structure in the data which may then be retrieved and displayed by specialized GUI interfaces for end-users. Additionally, the functionality can identify the most ideal or the best possible reference customer among many based on predictive analysis which builds the details of a reference customer into a probability to win/close the opportunity, which is data wholly unavailable from conventional systems.
- filter elements are provided to the user to narrow or limit the search according to desired criteria, such as industry, geography, deal size, products in play, etc. Such functionality thus aids sales professionals with improving sales productivity and business processes.
- functionality is provided to predict forecast adjustments on behalf of sales professionals.
- businesses commonly have a system of sales forecasting as part of their critical management strategy.
- Such forecasts are, by nature, inexact. The difficulty is knowing in which direction such forecasts are wrong and then turning that understanding into an improved picture of how the business is doing.
- Veritable's functionality can improve such forecasting using existing data of a customer organization including existing forecasting data. For instance, applying Veritable's analysis to the business's past forecasts can aid in trending and in improving existing forecasts for future periods which have yet to be realized. Sales managers are often asked to provide their judgment or adjustment on forecasting data for their respective sales representatives, which requires such sales managers to aggregate their respective sales representatives' individual forecasts.
- Veritable's analysis function mines past forecast trends by the sales representatives for relationships and causations such as forecast versus quota versus actual for the past eight quarters or other time period, and then provides a recommended judgment and/or adjustment that should be applied to the current forecast.
- organizations can reduce the variance between individual sales representative's stipulated quotas, forecasts, and actuals, over a period of time, thereby narrowing deltas between forecast and realized sales via improved forecast accuracy.
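- One simple way to picture a recommended forecast adjustment is as the ratio of realized sales to forecast sales over the past eight quarters, applied to the current forecast. The figures below are hypothetical, and this ratio is only an illustrative stand-in for whatever relationships Veritable's analysis actually mines.

```python
# Hypothetical (forecast, actual) pairs for one representative, past eight quarters
history = [
    (100, 90), (120, 108), (110, 99), (130, 117),
    (100, 90), (140, 126), (150, 135), (160, 144),
]

def adjustment_factor(quarters):
    """Ratio of realized sales to forecast sales over the history window."""
    total_forecast = sum(forecast for forecast, _ in quarters)
    total_actual = sum(actual for _, actual in quarters)
    if total_forecast == 0:
        raise ValueError("history contains no forecast volume")
    return total_actual / total_forecast

factor = adjustment_factor(history)
current_forecast = 200
adjusted_forecast = round(current_forecast * factor, 2)
print(factor, adjusted_forecast)
```

A factor below 1 indicates a representative who has historically over-forecast, so the current forecast is scaled down accordingly.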
- Additional functionality enables use case GUI interfaces to render a likelihood to renew an opportunity or probability of retention for an opportunity by providing a retention score. Such functionality is helpful to sales professionals as such metrics can influence where a salesperson's time and resources are best spent so as to maximize revenue.
- FIG. 21A provides a chart depicting prediction completeness versus accuracy.
- FIG. 21B provides a chart depicting an opportunity confidence breakdown.
- FIG. 21C provides a chart depicting an opportunity win prediction.
- FIG. 22A provides a chart depicting predictive relationships for opportunity scoring.
- FIG. 22B provides another chart depicting predictive relationships for opportunity scoring.
- FIG. 22C provides another chart depicting predictive relationships for opportunity scoring.
- Unbounded Categorical Data types model categorical columns where new values that are not found in the dataset can show up. For example, most opportunities will be replacing one of a handful of common existing systems, such as an Oracle implementation, but a new opportunity might be replacing a new system which has not been seen in the data ever before.
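- One common way to model a categorical column in which unseen values can appear is to reserve probability mass for a new value, as in a Chinese restaurant process. The sketch below uses that technique as an assumption; the source does not state which model Veritable applies. The names "System B" and "System C" and the concentration parameter `alpha` are hypothetical.

```python
from collections import Counter

NEW_VALUE = "<new value>"

def predictive_probabilities(observed, alpha=1.0):
    """Probability of each seen value, plus reserved mass for an unseen one."""
    counts = Counter(observed)
    total = len(observed) + alpha
    probs = {value: n / total for value, n in counts.items()}
    probs[NEW_VALUE] = alpha / total  # mass reserved for a never-seen value
    return probs

# Existing systems being replaced in past opportunities (hypothetical counts)
observed = ["Oracle"] * 5 + ["System B"] * 3 + ["System C"] * 1
probs = predictive_probabilities(observed, alpha=1.0)
print(probs)
```

Larger values of `alpha` make the model expect new values more often, while the reserved mass shrinks as more rows are observed.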
- FIG. 1 depicts an alternative exemplary architectural overview 300 of the environment in which embodiments may operate.
- there are depicted multiple customer organizations 305 A, 305 B, and 305 C. There may be many more customer organizations than those depicted.
- each of the customer organizations 305 A-C includes at least one client device 306 A, 306 B, and 306 C.
- a user may be associated with such a client device, and may further initiate requests to the host organization 310 which is connected with the various customer organizations 305 A-C and client devices 306 A-C via network 325 (e.g., such as via the public Internet), thus establishing a relationship between the cloud based services provider and the customer organizations.
- the client devices 306 A-C each individually transmit request packets 316 to the remote host organization 310 via the network 325 .
- the host organization 310 may responsively send response packets 315 to the originating customer organization to be received via the respective client devices 306 A-C.
- Such interactions thus establish the communications necessary to transmit and receive information in fulfillment of the described embodiments on behalf of each of the customer organizations and the host organization 310 providing the cloud based computing services including access to the Veritable functionality described.
- a request interface 375 which receives the request packets 316 and other requests from the client devices 306 A-C and facilitates the return of response packets 315 .
- a PreQL query interface 380 which operates to query the predictive database 350 in fulfillment of such request packets from the client devices 306 A-C, for instance, issuing API calls for PreQL structure query terms such as “PREDICT,” “RELATED,” “SIMILAR,” and “GROUP.” Also available are the API calls for “UPLOAD” and “ANALYZE,” so as to upload new data sets or define datasets to the predictive database 350 and trigger the Veritable core 390 to instantiate analysis of such data.
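- A request handler in front of such a query interface might classify each incoming term before forwarding it. The sketch below is purely illustrative: only the six term names come from the description, while the function name and the three action labels are hypothetical and do not reflect the actual PreQL API.

```python
# Term names are from the description; the grouping labels are hypothetical
QUERY_TERMS = {"PREDICT", "RELATED", "SIMILAR", "GROUP"}

def classify_term(term):
    """Map a PreQL term to the kind of work it requests."""
    normalized = term.strip().upper()
    if normalized in QUERY_TERMS:
        return "query_predictive_database"
    if normalized == "UPLOAD":
        return "load_dataset"
    if normalized == "ANALYZE":
        return "trigger_analysis"
    raise ValueError(f"unsupported PreQL term: {term!r}")

actions = [classify_term(t) for t in ("predict", "SIMILAR", "upload", "ANALYZE")]
print(actions)
```

Rejecting unknown terms up front keeps malformed requests from reaching the predictive database.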
- Server side application 385 may operate cooperatively with the various client devices 306 A-C.
- Veritable core 390 includes the necessary functionality to implement the embodiments described herein.
- FIG. 2 illustrates a block diagram of an example of an environment 210 in which an on-demand database service might be used.
- Environment 210 may include user systems 212 , network 214 , system 216 , processor system 217 , application platform 218 , network interface 220 , tenant data storage 222 , system data storage 224 , program code 226 , and process space 228 .
- environment 210 may not have all of the components listed and/or may have other elements instead of, or in addition to, those listed above.
- Environment 210 is an environment in which an on-demand database service exists.
- User system 212 may be any machine or system that is used by a user to access a database user system.
- any of user systems 212 can be a handheld computing device, a mobile phone, a laptop computer, a work station, and/or a network of computing devices.
- user systems 212 might interact via a network 214 with an on-demand database service, which is system 216 .
- system 216 is a database system that is made available to outside users that do not need to necessarily be concerned with building and/or maintaining the database system, but instead may be available for their use when the users need the database system (e.g., on the demand of the users).
- Some on-demand database services may store information from one or more tenants stored into tables of a common database image to form a multi-tenant database system (MTS).
- “on-demand database service 216 ” and “system 216 ” are used interchangeably herein.
- a database image may include one or more database objects.
- Application platform 218 may be a framework that allows the applications of system 216 to run, such as the hardware and/or software, e.g., the operating system.
- on-demand database service 216 may include an application platform 218 that enables creation, managing and executing one or more applications developed by the provider of the on-demand database service, users accessing the on-demand database service via user systems 212 , or third party application developers accessing the on-demand database service via user systems 212 .
- the users of user systems 212 may differ in their respective capacities, and the capacity of a particular user system 212 might be entirely determined by permissions (permission levels) for the current user. For example, where a salesperson is using a particular user system 212 to interact with system 216 , that user system has the capacities allotted to that salesperson. However, while an administrator is using that user system to interact with system 216 , that user system has the capacities allotted to that administrator. In systems with a hierarchical role model, users at one permission level may have access to applications, data, and database information accessible by a lower permission level user, but may not have access to certain applications, database information, and data accessible by a user at a higher permission level. Thus, different users will have different capabilities with regard to accessing and modifying application and database information, depending on a user's security or permission level.
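- The hierarchical role model described above can be sketched as a comparison of numeric permission levels, where a user may access anything available at the same or a lower level. The role names and level numbers are hypothetical.

```python
# Hypothetical roles; a higher number means a higher permission level
PERMISSION_LEVELS = {"salesperson": 1, "manager": 2, "administrator": 3}

def can_access(user_role, required_role):
    """True when the user's level is at or above the level the resource requires."""
    return PERMISSION_LEVELS[user_role] >= PERMISSION_LEVELS[required_role]

admin_reads_sales_data = can_access("administrator", "salesperson")
sales_reads_admin_data = can_access("salesperson", "administrator")
print(admin_reads_sales_data, sales_reads_admin_data)
```

Because the check depends only on the current user's role, the same user system gains or loses capacities depending on who is signed in.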
- Network 214 is any network or combination of networks of devices that communicate with one another.
- network 214 can be any one or any combination of a LAN (local area network), WAN (wide area network), telephone network, wireless network, point-to-point network, star network, token ring network, hub network, or other appropriate configuration.
- User systems 212 might communicate with system 216 using TCP/IP (Transmission Control Protocol and Internet Protocol) and, at a higher network level, use other common Internet protocols to communicate, such as HTTP, FTP, AFS, WAP, etc.
- user system 212 might include an HTTP client commonly referred to as a “browser” for sending and receiving HTTP messages to and from an HTTP server at system 216 .
- Such an HTTP server might be implemented as the sole network interface between system 216 and network 214 , but other techniques might be used as well or instead.
- the interface between system 216 and network 214 includes load sharing functionality, such as round-robin HTTP request distributors to balance loads and distribute incoming HTTP requests evenly over a plurality of servers. At least as for the users that are accessing that server, each of the plurality of servers has access to the MTS' data; however, other alternative configurations may be used instead.
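- A round-robin HTTP request distributor of the kind mentioned above can be sketched in a few lines. The class name, server names, and request strings are hypothetical.

```python
from itertools import cycle

class RoundRobinDistributor:
    """Assign incoming requests to servers in a fixed rotating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def route(self, request):
        # Return the next server in the rotation together with the request
        return next(self._rotation), request

distributor = RoundRobinDistributor(["app1", "app2", "app3"])
assigned = [distributor.route(f"GET /page/{i}")[0] for i in range(6)]
print(assigned)
```

Round-robin spreads requests evenly only when requests cost about the same; it takes no account of current server load.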
- system 216 implements a web-based customer relationship management (CRM) system.
- system 216 includes application servers configured to implement and execute CRM software applications as well as provide related data, code, forms, webpages and other information to and from user systems 212 and to store to, and retrieve from, a database system related data, objects, and Webpage content.
- data for multiple tenants may be stored in the same physical database object, however, tenant data typically is arranged so that data of one tenant is kept logically separate from that of other tenants so that one tenant does not have access to another tenant's data, unless such data is expressly shared.
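- Logical separation of tenant data within a shared physical object can be sketched as a filter on the owning organization, with an exception for rows that are expressly shared. The row layout, the `shared_with` field, and the organization identifiers are hypothetical.

```python
# Rows for several tenants held in one shared structure (hypothetical layout)
ROWS = [
    {"organization_id": "orgA", "name": "Deal 1", "shared_with": set()},
    {"organization_id": "orgB", "name": "Deal 2", "shared_with": {"orgA"}},
    {"organization_id": "orgB", "name": "Deal 3", "shared_with": set()},
]

def visible_rows(rows, tenant):
    """Rows a tenant owns, plus rows another tenant expressly shared with it."""
    return [
        row for row in rows
        if row["organization_id"] == tenant or tenant in row["shared_with"]
    ]

org_a_view = [row["name"] for row in visible_rows(ROWS, "orgA")]
org_b_view = [row["name"] for row in visible_rows(ROWS, "orgB")]
print(org_a_view, org_b_view)
```

Applying the filter in one shared access function, rather than in each caller, reduces the chance that a query omits the tenant check.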
- system 216 implements applications other than, or in addition to, a CRM application.
- system 216 may provide tenant access to multiple hosted (standard and custom) applications, including a CRM application.
- User (or third party developer) applications which may or may not include CRM, may be supported by the application platform 218 , which manages creation, storage of the applications into one or more database objects and executing of the applications in a virtual machine in the process space of the system 216 .
- One arrangement for elements of system 216 is shown in FIG. 2 , including a network interface 220 , application platform 218 , tenant data storage 222 for tenant data 223 , system data storage 224 for system data 225 accessible to system 216 and possibly multiple tenants, program code 226 for implementing various functions of system 216 , and a process space 228 for executing MTS system processes and tenant-specific processes, such as running applications as part of an application hosting service. Additional processes that may execute on system 216 include database indexing processes.
- each user system 212 may include a desktop personal computer, workstation, laptop, PDA, cell phone, or any wireless access protocol (WAP) enabled device or any other computing device capable of interfacing directly or indirectly to the Internet or other network connection.
- WAP wireless access protocol
- User system 212 typically runs an HTTP client, e.g., a browsing program, such as Microsoft's Internet Explorer browser, a Mozilla or Firefox browser, an Opera browser, or a WAP-enabled browser in the case of a smartphone, tablet, PDA or other wireless device, or the like, allowing a user (e.g., subscriber of the multi-tenant database system) of user system 212 to access, process and view information, pages and applications available to it from system 216 over network 214 .
- Each user system 212 also typically includes one or more user interface devices, such as a keyboard, a mouse, trackball, touch pad, touch screen, pen or the like, for interacting with a graphical user interface (GUI) provided by the browser on a display (e.g., a monitor screen, LCD display, etc.) in conjunction with pages, forms, applications and other information provided by system 216 or other systems or servers.
- GUI graphical user interface
- the user interface device can be used to access data and applications hosted by system 216 , and to perform searches on stored data, and otherwise allow a user to interact with various GUI pages that may be presented to a user.
- embodiments are suitable for use with the Internet, which refers to a specific global internetwork of networks. However, it is understood that other networks can be used instead of the Internet, such as an intranet, an extranet, a virtual private network (VPN), a non-TCP/IP based network, any LAN or WAN or the like.
- VPN virtual private network
- each user system 212 and all of its components are operator configurable using applications, such as a browser, including computer code run using a central processing unit such as an Intel Pentium® processor or the like.
- system 216 (and additional instances of an MTS, where more than one is present) and all of their components might be operator configurable using application(s) including computer code to run using a central processing unit such as processor system 217 , which may include an Intel Pentium® processor or the like, and/or multiple processor units.
- each system 216 is configured to provide webpages, forms, applications, data and media content to user (client) systems 212 to support the access by user systems 212 as tenants of system 216 .
- system 216 provides security mechanisms to keep each tenant's data separate unless the data is shared.
- MTS multi-tenant system
- they may be located in close proximity to one another (e.g., in a server farm located in a single building or campus), or they may be distributed at locations remote from one another (e.g., one or more servers located in city A and one or more servers located in city B).
- each MTS may include one or more logically and/or physically connected servers distributed locally or across one or more geographic locations.
- server is meant to include a computer system, including processing hardware and process space(s), and an associated storage system and database application (e.g., OODBMS or RDBMS) as is well known in the art. It is understood that “server system” and “server” are often used interchangeably herein.
- database object described herein can be implemented as single databases, a distributed database, a collection of distributed databases, a database with redundant online or offline backups or other redundancies, etc., and might include a distributed database or storage network and associated processing intelligence.
- FIG. 3 illustrates a block diagram of an embodiment of elements of FIG. 2 and various possible interconnections between these elements.
- FIG. 3 also illustrates environment 210 .
- the elements of system 216 and various interconnections in an embodiment are further illustrated.
- user system 212 may include a processor system 212 A, memory system 212 B, input system 212 C, and output system 212 D.
- FIG. 3 shows network 214 and system 216 .
- system 216 may include tenant data storage 222 , tenant data 223 , system data storage 224 , system data 225 , User Interface (UI) 330 , Application Program Interface (API) 332 (e.g., a PreQL or JSON API), PL/SOQL 334 , save routines 336 , application setup mechanism 338 , applications servers 300 1 - 300 N , system process space 302 , tenant process spaces 304 , tenant management process space 310 , tenant storage area 312 , user storage 314 , and application metadata 316 .
- environment 210 may not have the same elements as those listed above and/or may have other elements instead of, or in addition to, those listed above.
- system 216 may include a network interface 220 (of FIG. 2 ) implemented as a set of HTTP application servers 300 , an application platform 218 , tenant data storage 222 , and system data storage 224 . Also shown is system process space 302 , including individual tenant process spaces 304 and a tenant management process space 310 . Each application server 300 may be configured to access tenant data storage 222 and the tenant data 223 therein, and system data storage 224 and the system data 225 therein, to serve requests of user systems 212 .
- the tenant data 223 might be divided into individual tenant storage areas 312 , which can be either a physical arrangement and/or a logical arrangement of data.
- user storage 314 and application metadata 316 might be similarly allocated for each user. For example, a copy of a user's most recently used (MRU) items might be stored to user storage 314 . Similarly, a copy of MRU items for an entire organization that is a tenant might be stored to tenant storage area 312 .
- a UI 330 provides a user interface and an API 332 (e.g., a PreQL or JSON API) provides an application programmer interface to system 216 resident processes to users and/or developers at user systems 212 .
- the tenant data and the system data may be stored in various databases, such as one or more Oracle™ databases.
- Application platform 218 includes an application setup mechanism 338 that supports application developers' creation and management of applications, which may be saved as metadata into tenant data storage 222 by save routines 336 for execution by subscribers as one or more tenant process spaces 304 managed by tenant management process space 310 , for example. Invocations to such applications may be coded using PL/SOQL 334 , which provides a programming language style interface extension to API 332 (e.g., a PreQL or JSON API). Invocations to applications may be detected by one or more system processes, which manage retrieving application metadata 316 for the subscriber making the invocation and executing the metadata as an application in a virtual machine.
- Each application server 300 may be communicably coupled to database systems, e.g., having access to system data 225 and tenant data 223 , via a different network connection.
- one application server 300 1 might be coupled via the network 214 (e.g., the Internet)
- another application server 300 N-1 might be coupled via a direct network link
- another application server 300 N might be coupled by yet a different network connection.
- each application server 300 is configured to handle requests for any user associated with any organization that is a tenant. Because it is desirable to be able to add and remove application servers from the server pool at any time for any reason, there is preferably no server affinity for a user and/or organization to a specific application server 300 .
- an interface system implementing a load balancing function e.g., an F5 Big-IP load balancer
- the load balancer uses a least connections algorithm to route user requests to the application servers 300 .
- Other load balancing algorithms, such as round robin and observed response time, can also be used.
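- The least connections routing described above can be sketched as follows. This is a minimal illustration only, not the behavior of any particular load balancer product; the class name, method names, and server names are hypothetical. Because requests are routed purely by current load, any server can handle any user, which is consistent with the absence of server affinity and with adding or removing servers from the pool at any time.

```python
from dataclasses import dataclass, field


@dataclass
class LeastConnectionsBalancer:
    """Toy least-connections router (hypothetical names, illustration only)."""

    # Maps server name -> number of currently open connections.
    active: dict = field(default_factory=dict)

    def add_server(self, name: str) -> None:
        # Servers may join the pool at any time; new servers start idle.
        self.active.setdefault(name, 0)

    def remove_server(self, name: str) -> None:
        # Servers may leave the pool at any time.
        self.active.pop(name, None)

    def acquire(self) -> str:
        # Route to the server with the fewest open connections.
        # Ties are broken by server name so the choice is deterministic.
        if not self.active:
            raise RuntimeError("no application servers in the pool")
        name = min(self.active, key=lambda s: (self.active[s], s))
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        # Call when a request completes so the count reflects live load.
        if self.active.get(name, 0) > 0:
            self.active[name] -= 1
```

- A round robin variant would instead cycle through the server names in a fixed order regardless of how many connections each server currently holds.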
- system 216 is multi-tenant, in which system 216 handles storage of, and access to, different objects, data and applications across disparate users and organizations.
- one tenant might be a company that employs a sales force where each salesperson uses system 216 to manage their sales process.
- a user might maintain contact data, leads data, customer follow-up data, performance data, goals and progress data, etc., all applicable to that user's personal sales process (e.g., in tenant data storage 222 ).
- Since all of the data and the applications to access, view, modify, report, transmit, calculate, etc., can be maintained and accessed by a user system having nothing more than network access, the user can manage his or her sales efforts and cycles from any of many different user systems. For example, if a salesperson is visiting a customer and the customer has Internet access in their lobby, the salesperson can obtain critical updates as to that customer while waiting for the customer to arrive in the lobby.
- user systems 212 (which may be client systems) communicate with application servers 300 to request and update system-level and tenant-level data from system 216 that may require sending one or more queries to tenant data storage 222 and/or system data storage 224 .
- System 216 e.g., an application server 300 in system 216
- System data storage 224 may generate query plans to access the requested data from the database.
- Each database can generally be viewed as a collection of objects, such as a set of logical tables, containing data fitted into predefined categories.
- a “table” is one representation of a data object, and may be used herein to simplify the conceptual description of objects and custom objects as described herein. It is understood that “table” and “object” may be used interchangeably herein.
- Each table generally contains one or more data categories logically arranged as columns or fields in a viewable schema. Each row or record of a table contains an instance of data for each category defined by the fields.
- a CRM database may include a table that describes a customer with fields for basic contact information such as name, address, phone number, fax number, etc.
- Another table might describe a purchase order, including fields for information such as customer, product, sale price, date, etc.
- standard entity tables might be provided for use by all tenants.
- such standard entities might include tables for Account, Contact, Lead, and Opportunity data, each containing pre-defined fields. It is understood that the word “entity” may also be used interchangeably herein with “object” and “table.”
- tenants may be allowed to create and store custom objects, or they may be allowed to customize standard entities or objects, for example by creating custom fields for standard objects, including custom index fields.
- custom entity data rows are stored in a single multi-tenant physical table, which may contain multiple logical tables per organization. It is transparent to customers that their multiple “tables” are in fact stored in one large table or that their data may be stored in the same table as the data of other customers.
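- As a rough sketch of the single multi-tenant physical table described above, the following uses an in-memory SQLite database. The table name, column names, and sample organizations are assumptions made for illustration; the source does not specify a schema. Each row carries the owning organization and the logical table it belongs to, and every read is scoped by both, so one tenant never sees another tenant's rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE custom_entity_data (
        org_id      TEXT NOT NULL,    -- tenant (organization) owning the row
        entity_name TEXT NOT NULL,    -- logical "table" the row belongs to
        row_id      INTEGER NOT NULL, -- row identifier within the logical table
        field_name  TEXT NOT NULL,    -- custom field name
        field_value TEXT,             -- custom field value
        PRIMARY KEY (org_id, entity_name, row_id, field_name)
    )
    """
)

# Two tenants and three logical tables share one physical table.
rows = [
    ("org_a", "Invoice", 1, "amount", "120"),
    ("org_a", "Invoice", 2, "amount", "75"),
    ("org_a", "Shipment", 1, "carrier", "carrier_x"),
    ("org_b", "Invoice", 1, "amount", "990"),
]
conn.executemany("INSERT INTO custom_entity_data VALUES (?, ?, ?, ?, ?)", rows)


def select_logical_table(db, org_id, entity_name):
    """Return one tenant's view of one logical table as {row_id: {field: value}}."""
    cur = db.execute(
        "SELECT row_id, field_name, field_value FROM custom_entity_data "
        "WHERE org_id = ? AND entity_name = ? ORDER BY row_id, field_name",
        (org_id, entity_name),
    )
    result = {}
    for row_id, name, value in cur:
        result.setdefault(row_id, {})[name] = value
    return result
```

- Calling `select_logical_table(conn, "org_a", "Invoice")` returns only the two invoice rows of the first organization, even though the second organization's invoice is stored in the same physical table.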
- FIG. 4 illustrates a diagrammatic representation of a machine 400 in the exemplary form of a computer system, in accordance with one embodiment, within which a set of instructions, for causing the machine/computer system 400 to perform any one or more of the methodologies discussed herein, may be executed.
- the machine may be connected (e.g., networked) to other machines in a Local Area Network (LAN), an intranet, an extranet, or the public Internet.
- the machine may operate in the capacity of a server or a client machine in a client-server network environment, as a peer machine in a peer-to-peer (or distributed) network environment, as a server or series of servers within an on-demand service environment.
- Certain embodiments of the machine may be in the form of a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, switch or bridge, computing system, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
- PC personal computer
- PDA Personal Digital Assistant
- machine shall also be taken to include any collection of machines (e.g., computers) that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
- the exemplary computer system 400 includes a processor 402 , a main memory 404 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc., static memory such as flash memory, static random access memory (SRAM), volatile but high-data rate RAM, etc.), and a secondary memory 418 (e.g., a persistent storage device including hard disk drives and a persistent database and/or a multi-tenant database implementation), which communicate with each other via a bus 430 .
- Main memory 404 includes stored indices 424 , an analysis engine 423 , and a PreQL API 425 .
- Main memory 404 and its sub-elements are operable in conjunction with processing logic 426 and processor 402 to perform the methodologies discussed herein.
- the computer system 400 may additionally or alternatively embody the server side elements as described above.
- Processor 402 represents one or more general-purpose processing devices such as a microprocessor, central processing unit, or the like. More particularly, the processor 402 may be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processor 402 may also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. Processor 402 is configured to execute the processing logic 426 for performing the operations and functionality which is discussed herein.
- CISC complex instruction set computing
- RISC reduced instruction set computing
- VLIW very long instruction word
- the computer system 400 may further include a network interface card 408 .
- the computer system 400 also may include a user interface 410 (such as a video display unit, a liquid crystal display (LCD), or a cathode ray tube (CRT)), an alphanumeric input device 412 (e.g., a keyboard), a cursor control device 414 (e.g., a mouse), and a signal generation device 416 (e.g., an integrated speaker).
- the computer system 400 may further include peripheral device 436 (e.g., wireless or wired communication devices, memory devices, storage devices, audio processing devices, video processing devices, etc.).
- the secondary memory 418 may include a non-transitory machine-readable or computer readable storage medium 431 on which is stored one or more sets of instructions (e.g., software 422 ) embodying any one or more of the methodologies or functions described herein.
- the software 422 may also reside, completely or at least partially, within the main memory 404 and/or within the processor 402 during execution thereof by the computer system 400 , the main memory 404 and the processor 402 also constituting machine-readable storage media.
- the software 422 may further be transmitted or received over a network 420 via the network interface card 408 .
- FIG. 5A depicts a tablet computing device and a hand-held smartphone each having a circuitry integrated therein as described in accordance with the embodiments.
- FIG. 5B is a block diagram of an embodiment of tablet computing device, a smart phone, or other mobile device in which touchscreen interface connectors are used.
Abstract
Disclosed herein are systems and methods for predictive query implementation and usage in a multi-tenant database system including means for implementing predictive population of null values with confidence scoring, means for predictive scoring and reporting of business opportunities with probability to close scoring, and other related embodiments.
Description
- This application is related to, and claims priority to, the provisional utility application entitled “SYSTEMS AND METHODS FOR PREDICTIVE QUERY IMPLEMENTATION AND USAGE IN A MULTI-TENANT DATABASE SYSTEM,” filed on Mar. 13, 2013, having an application number of 61/780,503 and attorney docket No. 8956P119Z (520PROV), the entire contents of which are incorporated herein by reference.
- A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
- Embodiments of the invention relate generally to the field of computing, and more particularly, to systems and methods for predictive query implementation and usage in a multi-tenant database system including means for implementing predictive population of null values with confidence scoring, means for predictive scoring and reporting of business opportunities with probability to close scoring, and other related embodiments.
- Embodiments are illustrated by way of example, and not by way of limitation, and will be more fully understood with reference to the following detailed description when considered in connection with the figures in which:
- FIG. 1 depicts an exemplary architecture in accordance with described embodiments;
- FIG. 2 illustrates a block diagram of an example of an environment in which an on-demand database service might be used;
- FIG. 3 illustrates a block diagram of an embodiment of elements of FIG. 2 and various possible interconnections between these elements;
- FIG. 4 illustrates a diagrammatic representation of a machine in the exemplary form of a computer system, in accordance with one embodiment;
- FIG. 5A depicts a tablet computing device and a hand-held smartphone each having a circuitry integrated therein as described in accordance with the embodiments;
- FIG. 5B is a block diagram of an embodiment of tablet computing device, a smart phone, or other mobile device in which touchscreen interface connectors are used;
- FIG. 6 depicts a simplified flow for probabilistic modeling;
- FIG. 7 illustrates an exemplary landscape upon which a random walk may be performed;
- FIG. 8 depicts an exemplary tabular dataset;
- FIG. 9 depicts means for deriving motivation or causal relationships between observed data;
- FIG. 10A depicts an exemplary cross-categorization in still further detail;
- FIG. 10B depicts an assessment of convergence, showing inferred versus ground truth;
- FIG. 11 depicts a chart and graph of the Bell number series;
- FIG. 12A depicts an exemplary cross categorization of a small tabular dataset;
- FIG. 12B depicts an exemplary architecture having implemented data upload, processing, and predictive query API exposure in accordance with described embodiments;
- FIG. 12C is a flow diagram illustrating a method for implementing data upload, processing, and predictive query API exposure in accordance with disclosed embodiments;
- FIG. 12D depicts an exemplary architecture having implemented predictive query interface as a cloud service in accordance with described embodiments;
- FIG. 12E is a flow diagram illustrating a method for implementing predictive query interface as a cloud service in accordance with disclosed embodiments;
- FIG. 13A illustrates usage of the RELATED command term in accordance with the described embodiments;
- FIG. 13B depicts an exemplary architecture in accordance with described embodiments;
- FIG. 13C is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 14A illustrates usage of the GROUP command term in accordance with the described embodiments;
- FIG. 14B depicts an exemplary architecture in accordance with described embodiments;
- FIG. 14C is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 15A illustrates usage of the SIMILAR command term in accordance with the described embodiments;
- FIG. 15B depicts an exemplary architecture in accordance with described embodiments;
- FIG. 15C is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 16A illustrates usage of the PREDICT command term in accordance with the described embodiments;
- FIG. 16B illustrates usage of the PREDICT command term in accordance with the described embodiments;
- FIG. 16C illustrates usage of the PREDICT command term in accordance with the described embodiments;
- FIG. 16D depicts an exemplary architecture in accordance with described embodiments;
- FIG. 16E is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 16F depicts an exemplary architecture in accordance with described embodiments;
- FIG. 16G is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 17A depicts a Graphical User Interface (GUI) to display and manipulate a tabular dataset having missing values by exploiting a PREDICT command term;
- FIG. 17B depicts another view of the Graphical User Interface;
- FIG. 17C depicts another view of the Graphical User Interface;
- FIG. 17D depicts an exemplary architecture in accordance with described embodiments;
- FIG. 17E is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 18 depicts feature moves and entity moves within indices generated from analysis of tabular datasets;
- FIG. 19A depicts a specialized GUI to query using historical dates;
- FIG. 19B depicts an additional view of a specialized GUI to query using historical dates;
- FIG. 19C depicts another view of a specialized GUI to configure predictive queries;
- FIG. 19D depicts an exemplary architecture in accordance with described embodiments;
- FIG. 19E is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 20A depicts a pipeline change report in accordance with described embodiments;
- FIG. 20B depicts a waterfall chart using predictive data in accordance with described embodiments;
- FIG. 20C depicts an interface with defaults after adding a first historical field;
- FIG. 20D depicts in additional detail an interface with defaults for an added custom filter;
- FIG. 20E depicts another interface with defaults for an added custom filter;
- FIG. 20F depicts an exemplary architecture in accordance with described embodiments;
- FIG. 20G is a flow diagram illustrating a method in accordance with disclosed embodiments;
- FIG. 21A provides a chart depicting prediction completeness versus accuracy;
- FIG. 21B provides a chart depicting an opportunity confidence breakdown;
- FIG. 21C provides a chart depicting an opportunity win prediction;
- FIG. 22A provides a chart depicting predictive relationships for opportunity scoring;
- FIG. 22B provides another chart depicting predictive relationships for opportunity scoring; and
- FIG. 22C provides another chart depicting predictive relationships for opportunity scoring.
- The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also correspond to embodiments of the claimed inventions.
- Client organizations with datasets in their databases can benefit from predictive analysis. Unfortunately, there is no low cost and scalable solution in the marketplace today. Instead, client organizations must hire technical experts to develop customized mathematical constructs and predictive models which are very expensive. Consequently, client organizations without vast financial means are simply priced out of the market and thus do not have access to predictive analysis capabilities for their datasets.
- The present state of the art may therefore benefit from methods, systems, and apparatuses for predictive query implementation and usage in a multi-tenant database system as described herein.
- Users wanting to perform predictive analytics and data mining against their datasets must normally hire technical experts and explain the problem they wish to solve and then turn their data over to the hired experts to apply specialized mathematical constructs in an attempt to solve the problem.
- By analogy, many years ago when you designed a computer system it was necessary to also figure out how to put data on a physical disk. Now programmers do not concern themselves with such issues. Similarly, it is highly desirable to utilize a server and sophisticated database technology to perform data analytics for ordinary users without having to hire specialized experts. By doing so, resources could be freed up to focus on other problems.
- Some machine learning capabilities exist today. For instance, present capabilities can answer questions such as, “Is this person going to buy product x?” But such simplistic technology is not sufficient for helping people to solve more complex problems. For instance, Kaiser Healthcare corporation, with vast financial resources, may be able to hire experts from KXEN to develop customized analytics to solve a Kaiser-specific problem based on Kaiser's database, but a small company, by contrast, simply cannot afford KXEN's services because the cost far exceeds the small company's financial resources. Thus, our exemplary small company would be forced to simply forgo solving the problem at hand.
- Consider KXEN's own value proposition from their home page which states: “As a business analyst, you don't want to worry about complicated math or which algorithm to use. You need a model that is going to predict possible business outcomes. Is this customer likely to churn? Will they respond to a cross-sell or up-sell offer? . . . . We'll help you quickly get to the right algorithm for your business problem with a model built for accuracy and optimal results.”
- If a small company lacks the financial resources to hire a company such as KXEN and lacks the technical know-how to develop the “complicated math or [select] which algorithm to use,” then such a company must go without.
- Further still, the services offered today by technical experts in the field of analytics and predictive modeling provide solutions that are customized to the particular dataset of the customer. They do not offer capabilities that may be used by non-experts in an agnostic manner that is not anchored to a particular underlying dataset.
- Veritable offers a predictive database and additional commands and verbs so that a non-expert user can query the predictive database with inquiries such as: “predict revenue from users where age is greater than 35.”
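- To make the shape of such an inquiry concrete, the following toy sketch answers “predict revenue from users where age is greater than 35” over a small in-memory dataset. It is a deliberately naive stand-in that fills a missing value with the mean of the observed values among matching rows; it is not Veritable's analysis method, and the function name, field names, and sample data are hypothetical.

```python
from statistics import mean

# Hypothetical sample data; None marks a value the query should estimate.
users = [
    {"age": 28, "revenue": 100.0},
    {"age": 41, "revenue": 300.0},
    {"age": 52, "revenue": 500.0},
    {"age": 47, "revenue": None},
]


def predict(rows, target, where):
    """Naive stand-in for a predictive query.

    Keeps rows that satisfy `where`, then returns the `target` value for
    each kept row, substituting the mean of the observed values wherever
    the target is missing.
    """
    matching = [r for r in rows if where(r)]
    known = [r[target] for r in matching if r[target] is not None]
    if not known:
        raise ValueError("no observed values to base a prediction on")
    estimate = mean(known)
    return [r[target] if r[target] is not None else estimate for r in matching]


# "predict revenue from users where age is greater than 35"
result = predict(users, "revenue", lambda r: r["age"] > 35)
```

- The point of the sketch is the interface: the caller states a target and a condition in terms of the data, and does not choose an algorithm or build a model.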
- Further still, companies that hire analytics and predictive modeling experts are given a solution that adheres to the present data in their database but does not adapt to changes in the data or the database structure over time. Thus, a large company may hire KXEN, whose math experts come in, study the data, and build models, and the models do work; but when the nature of the data or the layout and structure of the database changes over time, as is normal and common for businesses, the models stop working because they were customized for the particular data and database structure at a given point in time.
- Because Veritable provides a predictive database that is not anchored to any particular underlying dataset, it remains useful as data and data structures change over time. For instance, data analysis performed by the Veritable core may simply be re-applied to a changed dataset. There is no need to re-hire experts or re-tool the models.
- With respect to salesforce.com specifically, the company offers cloud services to clients, organizations, and end users, and behind those cloud services is a multi-tenant database system which permits users to have customized data, customized field types, and so forth. The underlying data and data structures are customized by the client organization for its own particular needs. Veritable may nevertheless be utilized on these varying datasets and data structures because it is not anchored to a particular underlying database scheme, structure, or content.
- Customer organizations further benefit from the low cost of access. For instance, the cloud service provider may elect to provide the capability as part of an overall service offering at no additional cost, or may elect to provide the additional capabilities for an additional service fee. Regardless, because the Veritable capabilities are systematically integrated into the cloud service's computing architecture and do not require experts to custom tailor a solution to each particular client organization's dataset and structure, the scalability brings massive cost savings, thus enabling our exemplary small company with limited financial resources to go from a 0% capability, because they cannot afford to hire technical experts from KXEN, to, for instance, a 95% accuracy capability using Veritable. Even if a large company with sufficient financial resources could feasibly hire KXEN to develop customized mathematical constructs and models, they would need to evaluate the ROI of hiring KXEN, which may be able to get, for instance, 97% accuracy through customization but at high cost, versus using the turn-key access to the low cost cloud computing service which yields the exemplary 95% accuracy.
- Regardless of the decision for a large company with financial means, a small company which would otherwise not have access to predictive analytic capabilities can benefit greatly as their capability for predictive analysis accuracy goes from 0% (e.g., mere guessing) to the exemplary 95% using the scalable architecture provided by Veritable.
- In the following description, numerous specific details are set forth such as examples of specific systems, languages, components, etc., in order to provide a thorough understanding of the various embodiments. It will be apparent, however, to one skilled in the art that these specific details need not be employed to practice the embodiments disclosed herein. In other instances, well known materials or methods have not been described in detail in order to avoid unnecessarily obscuring the disclosed embodiments.
- In addition to various hardware components depicted in the figures and described herein, embodiments further include various operations which are described below. The operations described in accordance with such embodiments may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the operations. Alternatively, the operations may be performed by a combination of hardware and software.
- Embodiments also relate to an apparatus for performing the operations disclosed herein. This apparatus may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
- The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear as set forth in the description below. In addition, embodiments are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the embodiments as described herein.
- Embodiments may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the disclosed embodiments. A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable (e.g., computer-readable) medium includes a machine (e.g., a computer) readable storage medium (e.g., read only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, flash memory devices, etc.), a machine (e.g., computer) readable transmission medium (electrical, optical, acoustical), etc.
- Any of the disclosed embodiments may be used alone or together with one another in any combination. Although various embodiments may have been partially motivated by deficiencies with conventional techniques and approaches, some of which are described or alluded to within the specification, the embodiments need not necessarily address or solve any of these deficiencies, but rather, may address only some of the deficiencies, address none of the deficiencies, or be directed toward different deficiencies and problems which are not directly discussed.
- In one embodiment, means for predictive query implementation and usage in a multi-tenant database system execute at an application in a computing device, a computing system, or a computing architecture, in which the application is enabled to communicate with a remote computing device over a public Internet, such as remote clients, thus establishing a cloud based computing service in which the clients utilize the functionality of the remote application which implements the predictive query and usage capabilities.
- Model-based clustering techniques, including inference in Dirichlet process mixture models, have difficulty when different dimensions are best explained by very different clusterings. Based on MCMC inference in a novel nonparametric Bayesian model, the described methods automatically discover the number of independent nonparametric Bayesian models needed to explain the data, using a separate Dirichlet process mixture model for each group in an inferred partition of the dimensions. Unlike a DP mixture, the disclosed model is exchangeable over both the rows of a heterogeneous data array (the samples) and the columns (new dimensions), and can model any dataset as the number of samples and dimensions both go to infinity. Efficiency and robustness are improved through use of algorithms described herein which in certain instances require no preprocessing to identify veridical causal structure in provided raw datasets.
- Clustering techniques are widely used in data analysis for problems of segmentation in industry, exploratory analysis in science, and as a preprocessing step to improve performance of further processing in distributed computing and in data compression. However, as datasets grow larger and noisier, the assumption that a single clustering or distribution over clusterings can account for all the variability in the observations becomes less realistic if not wholly infeasible.
- From a machine learning perspective, this is an unsupervised version of the feature selection problem: different subsets of measurements should, in general, induce different natural clusterings of the data. From a cognitive science and artificial intelligence perspective, this issue is reflected in work that seeks multiple representations of data instead of a single monolithic representation.
- As a limiting case, a robust clustering method should be able to ignore an infinite number of uniformly random or perfectly deterministic measurements. The assumption that a single nonparametric model must explain all the dimensions is partly responsible for the accuracy issues Dirichlet process mixtures often encounter in high dimensional settings. DP mixture based classifiers via class conditional density estimation highlight the problem. For instance, while a discriminative classifier can assign low weight to noisy or deterministic and therefore irrelevant dimensions, a generative model must explain them. If there are enough irrelevancies, it ignores the dimensions relevant to classification in the process. Combined with slow MCMC convergence, these difficulties have inhibited the use of nonparametric Bayesian methods in many applications.
- To overcome these limitations, cross-categorization is utilized, which is an unsupervised learning technique for clustering based on MCMC inference in a novel nested nonparametric Bayesian model. This model can be viewed as a Dirichlet process mixture, over the dimensions or columns, of Dirichlet process mixture models over sampled data points or rows. Conditioned on a partition of the dimensions, our model reduces to an independent product of DP mixtures, but the partition of the dimensions, and therefore the number and domain of independent nonparametric Bayesian models, is also inferred from the data.
- Standard feature selection results in the case where the partition of dimensions has only two groups. The described model utilizes MCMC because both model selection and deterministic approximations seem intractable due to the combinatorial explosion of latent variables, with changing numbers of latent variables as the partition of the dimensions changes.
- The hypothesis space captured by the described model is super-exponentially larger than that of a Dirichlet process mixture, with a very different structure than a Hierarchical Dirichlet Process. A generative process, viewed as a model for heterogeneous data arrays with N rows, D columns of fixed type and values missing at random, can be described as follows:
- 1. For each dimension $d \in D$:
- (a) Generate hyperparameters $\vec{\lambda}_d$ from an appropriate hyper-prior.
- (b) Generate the model assignment $z_d$ for dimension $d$ from a Chinese restaurant process with hyperparameter $\alpha$ (with $\alpha$ from a vague hyperprior).
- 2. For each group $g$ in the dimension partition $\{z_d\}$:
- (a) For each sampled datapoint (or row) $r \in R$, generate a cluster assignment $z_r^g$ from a Chinese restaurant process with hyperparameter $\alpha_g$ (with $\alpha_g$ from a vague hyperprior).
- (b) For each cluster $c$ in the row partition for this group of dimensions $\{z_d^g\}$:
- i. For each dimension $d$, generate component model parameters $\vec{\theta}_c^d$ from an appropriate prior and $\vec{\lambda}_d$.
- ii. For each data cell $x_{(r,d)}$ in this component ($z_r^{z_d} = c$ for $d \in D$), generate its value from an appropriate likelihood and $\vec{\theta}_c^d$.
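- By way of illustration only, the generative steps above may be sketched in Python for the simplest case of Boolean columns. The function names `crp_partition` and `generate_table`, the fixed concentration value `alpha`, and the Beta(0.5, 0.5) prior standing in for the per-dimension hyperparameters are assumptions made for this sketch and are not taken from the described embodiments, which draw these quantities from hyperpriors.

```python
import random


def crp_partition(n_items, alpha, rng):
    """Sample cluster labels for n_items from a Chinese restaurant process."""
    labels, counts = [], []
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)
        counts[k] += 1
        labels.append(k)
    return labels


def generate_table(n_rows, n_cols, alpha=1.0, seed=0):
    """Generate a Boolean table plus the latent structure that produced it."""
    rng = random.Random(seed)
    # Step 1(b): assign every column (dimension) to a group z_d.
    col_group = crp_partition(n_cols, alpha, rng)
    n_groups = max(col_group) + 1
    # Step 2(a): each group of columns gets its own partition of the rows.
    row_cluster = [crp_partition(n_rows, alpha, rng) for _ in range(n_groups)]
    table = [[0] * n_cols for _ in range(n_rows)]
    for d in range(n_cols):
        clusters = row_cluster[col_group[d]]
        # Step 2(b)i: one Bernoulli parameter per (column, cluster), drawn
        # from a fixed Beta(0.5, 0.5) prior (an assumption of this sketch).
        theta = {c: rng.betavariate(0.5, 0.5) for c in sorted(set(clusters))}
        # Step 2(b)ii: generate each cell from its cluster's likelihood.
        for r in range(n_rows):
            table[r][d] = int(rng.random() < theta[clusters[r]])
    return table, col_group, row_cluster
```

- Because each group of columns carries its own row partition, two rows may fall in the same cluster under one group and in different clusters under another, which is the cross-cutting structure the model is intended to capture.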
- The model encodes a very different inductive bias than the IBP, discovering independent systems of categories over heterogeneous data vectors, as opposed to features that are typically additively combined. It is also instructive to contrast the asymptotic capacity of our model with that of a Dirichlet process mixture. The DP mixture has arbitrarily large asymptotic capacity as the number of samples goes to infinity. Put differently, it can model any distribution over finite dimensional vectors given enough data. However, if the number of dimensions (or features) is taken to infinity, it is no longer asymptotically consistent: if we generate a sequence of datasets by sampling the first K1 dimensions from a mixture and then append K2>>K1 dimensions that are constant valued (e.g. the price of tea in China), it will eventually be forced to model only those dimensions, ignoring the statistical structure in the first K1. In contrast, the model has asymptotic capacity both in terms of the number of samples and the number of dimensions, and is infinitely exchangeable with respect to both quantities.
- As a consequence, it is self-consistent over the subset of variables measured, and can thus enjoy considerable robustness in the face of noisy, missing, and irrelevant measurements or confounding statistical signals. This should be especially helpful in demographic settings and in high-throughput biology, where noisy, or coherently co-varying but orthogonal, measurements are the norm, and each data vector arises from multiple, independent generative processes in the world.
- The algorithm builds upon a general-purpose MCMC algorithm for probabilistic programs and specializes three of the kernels. It scales linearly per iteration in the number of rows and columns and includes inference over all hyperparameters.
- FIG. 10B depicts an assessment of convergence, showing inferred versus ground truth joint score for ~1000 MCMC runs (200 iterations each) with varying dataset sizes (up to 512 by 512, requiring ~1-10 minutes each) and true dimension groups. A strong majority of points fall near the ground truth dashed line, indicating reasonable convergence; perfect linearity is not expected, partly due to posterior uncertainty.
- A preliminary comparison of the learning curves for cross-categorization and one-versus-all SVMs on synthetic 5-class classification, averaged over datasets generated from 10 dimensional Bernoulli mixtures.
- The detailed mechanisms by which the latent variables introduced by this method above those in a regular DP mixture improve mixing performance. Massively parallel implementations exploit the conditional independencies in the described model. Because the described method is essentially parameter free (e.g. with improper uniform hyperpriors), robust to noisy and/or irrelevant measurements generated by multiple interacting causes, and supports arbitrarily sparsely observed, heterogeneous data, it may be broadly applicable in exploratory data analysis. Additionally, the performance of our MCMC algorithm suggests that the described approach to nesting latent variable models in a Dirichlet process over dimensions may be applied to generate robust, rapidly converging, cross-cutting variants of a wide variety of nonparametric Bayesian techniques.
- A co-assignment matrix for dimensions, where:
- $C_{ij} = \Pr[z_i = z_j]$
- That is, $C_{ij}$ is the probability that dimensions i and j share a common cause and therefore are modeled by the same Dirichlet process mixture. Labels show the consensus dimension groups (probability > 0.75). These reflect attributes that share a common cause and thus co-vary, while the remainder of the matrix captures correlations between these discovered causes, for instance, mammals rarely have feathers or fly, ungulates are not predators, and so forth. Each dimension group picks out a different cross-cutting categorization of the rows (e.g. vertebrates, birds, canines, . . . ; not shown).
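- As a non-limiting sketch, the co-assignment matrix may be estimated by counting, over a collection of sampled dimension partitions, how often two dimensions receive the same group label. The function names and the representation of each sample as a list of labels are assumptions made for this illustration.

```python
def co_assignment_matrix(samples):
    """Estimate C[i][j] = Pr[z_i = z_j] from sampled dimension partitions.

    samples: list of partitions; samples[s][i] is the group label that
    sample s assigns to dimension i.
    """
    n_dims = len(samples[0])
    counts = [[0] * n_dims for _ in range(n_dims)]
    for z in samples:
        for i in range(n_dims):
            for j in range(n_dims):
                if z[i] == z[j]:
                    counts[i][j] += 1
    return [[c / len(samples) for c in row] for row in counts]


def strongly_linked_pairs(matrix, threshold=0.75):
    """List the pairs (i, j), i < j, whose co-assignment exceeds threshold."""
    n_dims = len(matrix)
    return [(i, j) for i in range(n_dims) for j in range(i + 1, n_dims)
            if matrix[i][j] > threshold]
```

- Only the relative pattern of labels within each sample matters here, so the group labels do not need to be aligned across samples.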
- An example may include data for 4273 hospitals by 74 variables, including quality scores and various spending measurements. The data is analyzed (~1 hour for convergence) with no preprocessing or missing data imputation.
- Each box contains one consensus dimension group and the number of categories according to that group. In accordance with custom statistical analyses, no causal dependence between quality of care, hospital capacity, and spending is found, though each kind of measurement results in a different clustering of the hospitals. Also recovered is the cost structure of modern hospitals (e.g. increased long term care causes increased ambulance costs, likely due to an increase in at-home mishaps). Standard clustering methods miss most of this type of cross-cutting structure.
- Veritable and associated Veritable APIs make use of a predictive database that finds the causes behind data and uses these causes to predict and explain the future in a highly automated fashion heretofore unavailable, thus allowing any developer to carry out scientific inquiries against a dataset without requiring custom programming and consultation with mathematicians and other such experts.
- Veritable works by searching through the massive hypothesis space of all possible relationships present in a dataset, using an advanced Bayesian machine learning algorithm. The described Veritable technologies offer developers: state of the art inference performance and predictive accuracy on a very wide range of real-world datasets, with no manual parameter tuning whatsoever; scalability to very large datasets, including very high-dimensional data with hundreds of thousands or millions of columns; completely flexible predictions (e.g., predict the value of any subset of columns, given values for any other subset) without any retraining or adjustment; and quantification of the uncertainty associated with its predictions, since the system is built around a fully Bayesian probability model.
- Described applications built on top of Veritable range from predicting heart disease, to understanding health care expenditures, to assessing business opportunities and scoring a likelihood to successfully “close” such business opportunities (e.g., to successfully consummate a sale, contract, etc.).
- Consider the problems with real-world data. Different kinds of data are mixed together. Data is entered and maintained by people with other things on their minds and real work to get done. Real-world data contains errors and blanks, such as null values in places where populated values are appropriate. Users of the data may be measuring the wrong thing, or may be measuring the same thing in ten different ways. And in many organizations, no one knows precisely what data is contained in the database. Perhaps there was never a DBA (Database Administrator) for the organization, or the DBA left, and ten years of sedimentary layers of data have since built up. All of these are very realistic and common problems with “real-world” data found in production databases for various organizations, in contrast to pristine and small datasets that may be found in a laboratory or test setting.
- A system is needed that can make sense of data as it exists in real businesses and does not require a pristine dataset or data that conforms to typical platonic ideals of what data should look like. A system is needed that can be queried for many different questions about many different variables, in real time. A system is needed which is capable of getting at the hidden structures in such data, that is, which variables matter and what are the segments or groups within the data. At the same time, the system must be trustworthy, that is, it can't lie to the users by providing erroneous relationships and predictions. Such a system shouldn't reveal things that aren't true and shouldn't report ghost patterns that may exist in a first dataset but won't hold up overall. Such desirable characteristics are exceedingly difficult to attain with customized statistical analysis and customized predictive modeling, and wholly unheard of in automated systems.
- According to the described embodiments, the resulting database appears to its users much like a traditional database. But instead of selecting columns from existing rows, users may issue predictive query requests via a structured query language. Such a structured language, rather than SQL, may be referred to as Predictive Query Language (“PreQL”). PreQL is not to be confused with PQL, which is short for the “Program Query Language.”
- PreQL is thus used to issue queries against the database to predict values. Such a PreQL query offers the same flexibility as SQL-style queries. When exploring structure, users may issue PreQL queries seeking notions of similarity that are hidden or latent in the overall data without advanced knowledge of what those similarities may be. When used in a multi-tenant database system against a massive cloud based database and its dataset, such features are potentially transformative in the computing arts.
- According to certain embodiments, Veritable utilizes a specially customized probabilistic model based upon foundational CrossCat modeling. CrossCat is a good start but could nevertheless be improved. For instance, it was not possible to run equations with CrossCat, a limitation which is solved via the core of Veritable, which uses a particular engine implementation enabling such equation execution. Additionally, prior models matched data with the model to understand hidden structure, like building a probabilistic index, but were so complex that their users literally required advanced mathematics and probability theory understanding simply to implement the models for any given dataset, rendering mere mortals incapable of realistically using such models. Veritable implementations described herein provide a service which includes distributed processing, job scheduling, persistence, check-pointing, and a user-friendly API or front-end interface which accepts users' questions and queries via the PreQL query structure. Other specialized front end GUIs and interfaces are additionally described to solve for particular use cases on behalf of users and provide other simple interfaces to complex problems of probability.
- What is probability? There are many perspectives, but probability may be described as a statement, by an observer, about a degree of belief in an event, past, present, or future. Timing doesn't matter.
- What is uncertainty? An observer, as noted above, doesn't know for sure whether an event will occur, notwithstanding the degree of belief in such an event having occurred, occurring, or to occur in the future.
- Probabilities are assigned relative to knowledge or information context. Different observers can have different knowledge, and assign different probabilities to the same event, or assign different probabilities even when both observers have the same knowledge. Probability, as used herein, is a number between “0” (zero) and “1” (one), in which 0 means the event is sure to not occur on one extreme of a continuum and where 1 means the event is sure to occur on the other extreme of the same continuum. Both extremes are uninteresting because they represent a complete absence of uncertainty.
- A probability ties belief to one event. A probability distribution ties beliefs to every possible event, or at least, every event we want to consider. Choosing the outcome space is an important modeling decision. The probabilities summed over all outcomes in the space must total “1,” that is to say, one of the outcomes must occur. Probability distributions are convenient mathematical forms that help summarize the system's beliefs in the various probabilities, but choosing a standard distribution is a modeling choice in which all models are wrong, but some are useful.
- Consider, for example, a Poisson distribution, which is a good model when some event can occur 0 or more times in a span of time. The outcome space is the number of times the event occurs. The Poisson distribution has a single parameter, which is the rate, that is, the average number of times the event occurs. Its mathematical form has some nice properties: it is defined for all the non-negative integers, and it sums to 1.
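- For illustration, the two properties noted above may be checked numerically with the standard Poisson probability mass function. The function name `poisson_pmf` and the example rate of 3.0 are assumptions chosen for this sketch.

```python
import math


def poisson_pmf(k, rate):
    """Probability that an event with the given average rate occurs k times."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


rate = 3.0
# The outcome space is every non-negative integer; the tail beyond k = 60
# is negligible for this rate, so a finite sum is an adequate check.
total = sum(poisson_pmf(k, rate) for k in range(60))
mean = sum(k * poisson_pmf(k, rate) for k in range(60))
```

- Here `total` comes out numerically indistinguishable from 1 and `mean` recovers the rate parameter, consistent with the rate being the average number of occurrences.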
- Many examples exist besides the Poisson distribution. Each such standard distribution encompasses a certain set of assumptions, such as a particular outcome space, a particular way of assigning probabilities to outcomes, etc. If you work with them, you'll start to understand why some are nice and some are frustrating if not outright evil.
- Veritable utilizes distributions which move beyond the standard distributions with specially customized modeling, thus allowing for a more complex outcome space and further allowing for more complex ways of assigning probabilities to outcomes. Depicted at slide 8B above is what's called a mixture distribution, which combines a bunch of simpler distributions to form a more complex one. A mixture of Gaussians may be employed to model any distribution, while still assigning probabilities to outcomes, yielding a more involved mathematical relationship.
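- As a minimal sketch of a mixture distribution, a weighted sum of Gaussian densities may be evaluated as follows. The component weights, means, and standard deviations are hypothetical values chosen for illustration, as are the function names.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a single Gaussian component at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Density of a mixture; components is a list of (weight, mean, std)."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)


# Two simple distributions combined into one bimodal distribution.
components = [(0.3, -2.0, 0.5), (0.7, 3.0, 1.5)]
density_at_zero = mixture_pdf(0.0, components)
```

- Because the weights sum to 1 and each component integrates to 1, the mixture remains a valid probability distribution while taking a shape that no single Gaussian can.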
- With more complex outcome spaces, a Mondrian process defines a distribution on k-dimensional trees, providing means for dividing up a square or a cube. The outcome space is all possible trees and resulting divisions look like the famous painting. The outcome space is more structured than what is offered by the standard distributions. CrossCat does not use the Mondrian process, but it does use a structured outcome space. Veritable utilizes the Mondrian process in select embodiments.
- At a high level, probability theory is a generalization of logic. Just like computers can use logic to reason deductively, probability lets computers reason inductively, generalize, categorize, etc. Probability gives us a way to combine different sources of information in a systematic manner, that is, utilizing automated computer implemented functionality, even when that information is vague, or uncertain, or ambiguous.
- FIG. 6 depicts a simplified flow for probabilistic modeling. Modeling is a series of choices and assumptions. For instance, it is possible to trade off fidelity and detail with tractability. Assumptions define an outcome space. Such an outcome space may be considered hypotheses, and in the modeling view, one of these possible hypotheses actually occurs. This is the hidden structure, and it is this hidden structure that generates the data. The hidden structure and the resulting generated data may be considered the generative view. From a learning or inference perspective, sources of information about the hidden structure may include certain modeling assumptions (“prior”), as well as data observed (“likelihood”), from which a combination of prior and likelihood may be utilized to draw conclusions (“posterior”).
- Such assumptions don't just give us a hypothesis space, they also give us a way of assigning probabilities to the hypotheses, yielding a probability distribution on hypotheses, given the actual data observed.
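- The combination of prior and likelihood into a posterior described for FIG. 6 may be sketched for a small, discrete hypothesis space. The two coin hypotheses and the observed data (8 heads in 10 flips) are invented for this illustration, and the function name `posterior` is likewise an assumption.

```python
def posterior(prior, likelihood):
    """Combine prior beliefs with the likelihood of the observed data.

    prior[h] is the belief in hypothesis h before seeing data;
    likelihood[h] is the probability of the observed data under h.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: u / evidence for h, u in unnormalized.items()}


# Hypothetical example: is a coin fair (heads 50%) or biased (heads 80%)?
prior = {"fair": 0.5, "biased": 0.5}
# Probability of one particular sequence with 8 heads and 2 tails; the
# binomial coefficient is the same for both hypotheses and cancels out.
likelihood = {"fair": 0.5 ** 10, "biased": 0.8 ** 8 * 0.2 ** 2}
beliefs = posterior(prior, likelihood)
```

- With real datasets the hypothesis space is far too large to enumerate in this way, which is why the sampling-based inference described below is used instead.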
- There can be a great many hypotheses, and finding the best ones to explain the data is not a straightforward or obvious proposition.
- Modeling makes assumptions and using the assumptions defines a hypothesis space. Probabilities are assigned to the hypotheses given data observed and then inference is used to figure out which of those explanatory hypotheses is the best or are plausible.
- Many approaches exist and experts in the field do not agree on how to select the best hypothesis. In simple cases, we can use math to solve the equations directly. Optimization methods are popular, such as hill climbing and its relatives. Veritable may use any such approaches, but according to certain described embodiments, Monte Carlo methods are specifically utilized in which a random walk is taken through the space of hypotheses. Random doesn't mean stupid, of course. In fact, efficiently navigating these huge spaces is one of the innovations utilized to improve the path taken by the random walk.
- FIG. 7 illustrates an exemplary landscape upon which a random walk may be performed. Consider the exemplary random walk in which each axis is one dimension in the hypothesis space. Real spaces can have very many dimensions. The height of the surface is the probability of the hidden variables, given the data and modeling assumptions. Exploration starts by taking a random step somewhere, anywhere. If the step is higher, it is kept; if the step is lower, it is kept sometimes and other times it is not, electing to stay put instead. The result is seemingly magic: the walk is guaranteed to explore the space in proportion to the true probability values. Over the long run, both peaks in this example are found, whereas simple hill climbing would get caught on one. Such an approach thus explores the whole space. Other innovations include added intelligence about jumps and exploring one or many dimensions at a time.
- According to exemplary models, tabular data is provided such that rows are individual entities and columns contain a certain piece of information about the respective entity (e.g., field). Different kinds of columns in one table are acceptable.
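- The keep-or-stay rule described for the random walk of FIG. 7 corresponds to the well-known Metropolis acceptance rule. The following is a minimal sketch on a one-dimensional landscape with two peaks; the landscape heights, step count, and function name are hypothetical, and the sketch is not the inference engine of the described embodiments.

```python
import random


def metropolis_walk(heights, n_steps, seed=0):
    """Random walk over positions 0..len(heights)-1; returns visit frequencies."""
    rng = random.Random(seed)
    position = 0
    visits = [0] * len(heights)
    for _ in range(n_steps):
        proposal = position + rng.choice((-1, 1))
        if 0 <= proposal < len(heights):
            # Higher steps are always kept; lower steps are kept with
            # probability equal to the ratio of the two heights.
            if rng.random() < heights[proposal] / heights[position]:
                position = proposal
        # A rejected or off-the-edge proposal means the walk stays put.
        visits[position] += 1
    return [v / n_steps for v in visits]


landscape = [1, 4, 8, 4, 1, 1, 4, 8, 4, 1]  # two peaks separated by a valley
frequencies = metropolis_walk(landscape, 400_000)
target = [h / sum(landscape) for h in landscape]
```

- Over a long run, `frequencies` approaches `target`, so both peaks are visited in proportion to their height, whereas a pure hill climber started at position 0 would stop at the first peak.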
- FIG. 8 depicts an exemplary tabular dataset. In the exemplary table, rows are mammals and columns are variables that describe the mammals. Most are Boolean but some are categorical.
- FIG. 9 depicts means for deriving motivation or causal relationships between observed data. Consider the realistic problems with “real-world” data described in prior slides, in which it is proffered that real-world data is not pristine. For instance, there may be data which simply doesn't matter: some columns may not matter, or certain columns may carry redundant information. A system is needed which utilizes a model that can understand the predictive relationships between all the columns. That is, some columns are predictively related and should get grouped together, whereas others are not predictively related and should be grouped separately. We call these groups of columns “views.” Within each view, the rows are grouped into categories.
- FIG. 10A depicts an exemplary cross-categorization in still further detail.
- Utilizing cross-categorization, columns/features are grouped into views and rows/entities are grouped into categories. View 3 / Category 3 from slide 19 above is expanded to the right, having 8 features and 4 entities. The highlighted rows are in different categories in View 3 but are in the same category in another view. Zoom in again, and it is seen that this category contains the actual data points corresponding to the cell values in the table for just the columns in this view, and just the rows in this category.
- A single cross-categorization is a particular way of slicing and dicing the table: first by column and then by row. It's a particular kind of process to yield a desired structured space. Utilizing concepts discussed above with respect to probability distributions, a probability is then assigned to each cross-categorization. More complex cross-categorizations yielding more views and more categories are less probable in and of themselves and typically are warranted only when the data really supports them.
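- One possible in-memory representation of a single cross-categorization is sketched below: a mapping from each column to its view, and, for each view, a mapping from each row to its category. The table contents, column names, and the `cells` helper are hypothetical and serve only to show the slicing by column and then by row.

```python
# Hypothetical toy table: rows are mammals, columns are Boolean features.
table = {
    "dolphin": {"lives_in_water": 1, "has_flippers": 1, "large": 0, "eats_plants": 0},
    "walrus": {"lives_in_water": 1, "has_flippers": 1, "large": 1, "eats_plants": 0},
    "zebra": {"lives_in_water": 0, "has_flippers": 0, "large": 1, "eats_plants": 1},
    "cat": {"lives_in_water": 0, "has_flippers": 0, "large": 0, "eats_plants": 0},
}

cross_cat = {
    # Column -> view: the first slicing, by column.
    "views": {"lives_in_water": 0, "has_flippers": 0, "large": 1, "eats_plants": 1},
    # View -> (row -> category): a separate row grouping for every view.
    "categories": {
        0: {"dolphin": 0, "walrus": 0, "zebra": 1, "cat": 1},
        1: {"dolphin": 0, "walrus": 1, "zebra": 1, "cat": 0},
    },
}


def cells(table, cross_cat, view, category):
    """Return the cell values for just the columns in a view and rows in a category."""
    columns = [c for c, v in cross_cat["views"].items() if v == view]
    rows = [r for r, k in cross_cat["categories"][view].items() if k == category]
    return {r: {c: table[r][c] for c in columns} for r in rows}


block = cells(table, cross_cat, view=0, category=0)
```

- In this toy example the dolphin and the walrus share a category in view 0 but fall into different categories in view 1, mirroring the highlighted rows discussed above.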
- FIG. 11 depicts a chart and graph of the Bell number series. A series called the Bell numbers defines the number of partitions for n labeled objects which, as can be seen from the graph on the right, grows really, really fast. A handful of objects are exemplified in the chart on the left. Plotted on the right is a plot through 200 resulting in 1e+250 or a number with 250 zeros. Now consider the massive datasets available in a cloud computing multitenant database system, which could easily result in datasets of interest with thousands of columns and millions of rows. Thus, such datasets will not merely result in the Bell numbers depicted above, but rather, potentially the Bell numbers “squared,” placing us firmly into the land of ludicrous numbers.
- These numbers are so massive that it may be helpful to consider the following context. The number in red, that is, the horizontal line near the very bottom of the plot on the right, is the total number of web pages. Thus, Google only needs to search through the 17th Bell number or so. The space is so unimaginably massive that it simply is not possible to explore it exhaustively. Moreover, because it's not smooth or concave, you can't just climb the hill either.
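- The growth of the Bell numbers may be reproduced with the standard Bell triangle recurrence, sketched below with exact integer arithmetic. The function name `bell_numbers` is an assumption of this illustration.

```python
def bell_numbers(count):
    """Return the first `count` Bell numbers [B_0, B_1, ...] via the Bell triangle."""
    bells = []
    row = [1]
    for _ in range(count):
        # The first entry of each triangle row is the next Bell number.
        bells.append(row[0])
        # Each new row starts with the last entry of the previous row; every
        # further entry adds the entry to its left and the one above that.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return bells


first_bells = bell_numbers(18)
digits_in_17th = len(str(first_bells[17]))
```

- Python integers are arbitrary precision, so the same function may be called with a larger `count` to see how quickly the number of possible partitions outgrows any exhaustive search.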
- So where does the data come into cross-categorizations? Views pick out some of the columns and categories pick out some of the rows. Each column contains a single kind of data so each vertical strip within a category contains typed data such as numerical, categorical, etc. Now, the basic standardized distributions may be utilized more effectively. In certain embodiments, each collection of points is modeled with a single simple distribution. Basic distributions are pre-selected which work well for each data type and each selected basic distribution is only responsible for explaining a small subset of the actual data, for which it is particularly useful. Then using the mixture distribution discussed above, the basic distributions are combined such that a bunch of simple distributions are used to make a more complex one. The structure of the cross-categorization is used to chop up the data table into a bunch of pieces and each piece is modeled using the simple distribution selected of the data type, yielding a big mixture distribution of the data.
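- As a simplified sketch of the preceding paragraph, a cross-categorization may be used to chop a Boolean table into pieces, with each piece (one column within one category) explained by a single simple distribution. The smoothed Bernoulli estimate used here is an assumption standing in for the basic distributions of the described embodiments, and the score is a likelihood only; it does not include the prior that makes more complex cross-categorizations less probable.

```python
import math


def log_likelihood(table, col_view, row_category):
    """Score a Boolean table under one cross-categorization.

    table[r][d] is 0 or 1; col_view[d] is the view of column d;
    row_category[v][r] is the category of row r within view v.
    """
    total = 0.0
    for d, view in enumerate(col_view):
        categories = row_category[view]
        for c in sorted(set(categories)):
            # One vertical strip: column d restricted to the rows in category c.
            strip = [table[r][d] for r in range(len(table)) if categories[r] == c]
            # A single Bernoulli explains the strip (add-one smoothing).
            p = (sum(strip) + 1) / (len(strip) + 2)
            total += sum(math.log(p if x else 1.0 - p) for x in strip)
    return total


toy = [[1, 1], [1, 1], [0, 0], [0, 0]]
one_category = log_likelihood(toy, [0, 0], [[0, 0, 0, 0]])
two_categories = log_likelihood(toy, [0, 0], [[0, 0, 1, 1]])
```

- Splitting the rows into the two natural categories lets each simple distribution explain only a small, homogeneous subset of the data, which is why `two_categories` scores higher than `one_category` on this toy table.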
-
FIG. 12A depicts an exemplary cross categorization of a small tabular dataset. - So what does this look like? Go back to our mammals example providing a sample cross-categorization having two views. The view on the right has the habitat and feeding style columns and the rows are divided into four categories: land mammals (Persian cat through Zebra), sea predators (dolphin through walrus), baleen whales (blue whale and humpback whale only), and the outlier amphibious beaver (e.g., both land and water living; we do not suggest that mammal beavers have gills). The view on the left has another division in which the primates are grouped together, large mammals are grouped, grazers are grouped, and then a couple of data oddities at the bottom (bat and seal). Even with a small dataset it is easy to imagine different ways of dividing the data up. But data is ambiguous. There is no perfect or obviously right division. For all the groupings that seemingly fit correctly, certain groupings may seem awkward or poor fitting. The systematic process of applying various models and assumptions makes tradeoffs and compromises, which is why even experts cannot agree on a single approach. Nevertheless, the means described herein permit use of a variety of available models such that these tradeoffs and compromises may be exploited to further benefit the system.
- Results are thus not limited to a single cross-categorization. Instead, a collection of them is utilized and such a collection when used together tells us about the hidden structure of the data. For instance, if they're all the same, then there was no ambiguity in the data, but such a result doesn't occur with real-world data, despite being a theoretical possibility. Conversely, if they're all completely different, that means we couldn't find any structure in the data, which sometimes happens, and requires some additional post-processing to get at the uncertainty, such as feeding in additional noise. Typically, however, something in between occurs, and some interesting hidden structure is revealed from the data.
- The specially customized cross-categorization implementation represents the core of Veritable. This core is not directly exposed to the users who interface via APIs, PreQL, and specialized utility GUIs and interfaces, but such users nevertheless benefit from the functionality which drives these other capabilities.
- The Veritable core utilizes Monte Carlo methods for certain embodiments.
-
FIG. 12B depicts an exemplary architecture having implemented data upload, processing, and predictive query API exposure in accordance with described embodiments. - First, you need to get data into the system so API calls are provided to upload data into the system. A row is the basic unit. An API call for “Analyze” kicks off a learning pass applying the specially customized CrossCat model for the uploaded data. It's also possible to specify an existing dataset, or to define a sub-set of data from a larger dataset.
- Cross-categorizations are found in the data that are the most plausible explanations of the data at hand. Though such functions happen in the background, out of view of the user, such functionality is computationally intensive and is thus well suited for a distributed computing structure provided by a cloud based multi-tenant database system architecture.
-
FIG. 12C is a flow diagram illustrating a method for implementing data upload, processing, and predictive query API exposure in accordance with disclosed embodiments. -
FIG. 12D depicts an exemplary architecture having implemented predictive query interface as a cloud service in accordance with described embodiments. -
FIG. 12E is a flow diagram illustrating a method for implementing predictive query interface as a cloud service in accordance with disclosed embodiments. -
FIG. 13A illustrates usage of the RELATED command term in accordance with the described embodiments. - Using PreQL, specialized queries are thus made feasible. For instance, we can ask: for a given column, what are the other columns that are predictively related to it? In terms of the cross-categorizations, we tabulate how often each of the other columns appears in the same view as the input column, thus revealing what matters and what doesn't matter.
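The tabulation described for RELATED can be sketched in a few lines of Python. The ensemble below is hypothetical sample data, each cross-categorization is reduced to just its views (sets of column names), and the function name `related` is illustrative rather than an actual API of the system.

```python
from collections import Counter

# Hypothetical ensemble of ten cross-categorizations. Each one is a list of
# views, and each view is the set of columns grouped together in that sample.
ensemble = (
    [[{"height", "weight", "age"}, {"hair_color"}]] * 7
    + [[{"height", "weight", "age", "hair_color"}]] * 1
    + [[{"height", "weight"}, {"age", "hair_color"}]] * 2
)


def related(ensemble, target):
    """Score every other column by how often it shares a view with target."""
    counts = Counter()
    columns = set()
    for views in ensemble:
        for view in views:
            columns |= view
            if target in view:
                counts.update(view - {target})
    # Fraction of cross-categorizations in which each column co-occurs.
    return {c: counts[c] / len(ensemble) for c in sorted(columns - {target})}


if __name__ == "__main__":
    print(related(ensemble, "height"))
    # {'age': 0.8, 'hair_color': 0.1, 'weight': 1.0}
```

A column that always lands in the same view as the input column scores 1.0, while a column that rarely does scores near 0.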
-
FIG. 13B depicts an exemplary architecture in accordance with described embodiments. -
FIG. 13C is a flow diagram illustrating a method in accordance with disclosed embodiments. -
FIG. 14A illustrates usage of the GROUP command term in accordance with the described embodiments. -
FIG. 14B depicts an exemplary architecture in accordance with described embodiments. -
FIG. 14C is a flow diagram illustrating a method in accordance with disclosed embodiments. - Using PreQL, we can ask what rows “go together.” Such a feature can be conceptualized as clustering, except that there's more than one way to cluster. Consider the mammals example in which we additionally input a column via the PreQL query. The groups that are returned are in the context of that column.
-
FIG. 15A illustrates usage of the SIMILAR command term in accordance with the described embodiments. -
FIG. 15B depicts an exemplary architecture in accordance with described embodiments. -
FIG. 15C is a flow diagram illustrating a method in accordance with disclosed embodiments. - Using PreQL, we can ask which rows are most similar to a given row. Rows can be similar in one context but dissimilar in another. For instance, killer whales and blue whales are a lot alike in some respects, but very different in others. The input column disambiguates.
-
FIG. 16A illustrates usage of the PREDICT command term in accordance with the described embodiments. - Using PreQL, we can predict, that is, ask the system to render a prediction. With the cross-categorizations a prediction request is treated as a new row and we assign that row to categories in each cross-categorization. Then using the basic standardized distributions for each category the values we want to predict are predicted. Unlike conventional predictive analytics, the system provides for flexible predictive queries thus allowing the user of the PreQL query to specify as many or as few columns as they desire and thus allowing the system to predict as many or as few as the user wants.
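A minimal sketch of this prediction step follows, under stated assumptions: all columns are categorical, each category stores a prior weight and a per-column value distribution, and only a single target column is predicted. The data layout, the sample numbers, and the name `predict` are hypothetical and are not the system's actual API.

```python
# Hypothetical ensemble of two cross-categorizations. Each view lists its
# columns and its categories; each category has a prior weight and one
# simple categorical distribution per column.
ensemble = [
    [   # sample 1: stage and is_won share a view
        {"columns": ["stage", "is_won"],
         "categories": [
             {"weight": 0.5,
              "dists": {"stage": {"quote": 0.8, "prospect": 0.2},
                        "is_won": {"yes": 0.7, "no": 0.3}}},
             {"weight": 0.5,
              "dists": {"stage": {"quote": 0.2, "prospect": 0.8},
                        "is_won": {"yes": 0.1, "no": 0.9}}},
         ]},
    ],
    [   # sample 2: is_won sits in its own view
        {"columns": ["stage"],
         "categories": [{"weight": 1.0,
                         "dists": {"stage": {"quote": 0.5, "prospect": 0.5}}}]},
        {"columns": ["is_won"],
         "categories": [{"weight": 1.0,
                         "dists": {"is_won": {"yes": 0.4, "no": 0.6}}}]},
    ],
]


def predict(ensemble, fixed, target):
    """Average the target's predictive distribution over all samples."""
    total = {}
    for views in ensemble:
        # Only the view that contains the target column informs the prediction.
        view = next(v for v in views if target in v["columns"])
        # Soft-assign the new row to categories: prior weight times the
        # likelihood of the fixed values that live in this view.
        scores = []
        for cat in view["categories"]:
            score = cat["weight"]
            for column, value in fixed.items():
                if column in view["columns"]:
                    score *= cat["dists"][column].get(value, 1e-9)
            scores.append(score)
        norm = sum(scores)
        # Mix the per-category distributions of the target column.
        for cat, score in zip(view["categories"], scores):
            for value, prob in cat["dists"][target].items():
                total[value] = total.get(value, 0.0) + (score / norm) * prob
    return {value: p / len(ensemble) for value, p in total.items()}
```

With `fixed={"stage": "quote"}` and `target="is_won"`, the first sample shifts weight toward the category where quotes are common, the second sample ignores the stage entirely, and the returned distribution is the average of the two. Fixed columns that fall outside the target's view are simply skipped, which is what lets the caller supply as many or as few columns as desired.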
- For instance, consider classification or regression in which all but one of the columns are used to predict a single target column. Veritable's core, via the APIs, can predict using a single target column or can predict using a few target columns at the user's discretion. For instance, a user can query the system asking: will an opportunity close AND at what amount? Such capabilities do not exist using conventional means.
-
FIG. 16B illustrates usage of the PREDICT command term in accordance with the described embodiments. -
FIG. 16C illustrates usage of the PREDICT command term in accordance with the described embodiments. At the extreme, predict can be utilized to predict a row without fixing anything, thus asking Veritable to make up a row that isn't actually in the underlying source data, but could be nevertheless, resulting in what may be considered a synthetic row. Such a row will exhibit all of the structure and predictive relationships as in the real data. Such a capability may enable a user to test a dataset that's realistic, but not radioactive, without having to manually enter or guess at what such data may look like. -
FIG. 16D depicts an exemplary architecture in accordance with described embodiments. -
FIG. 16E is a flow diagram illustrating a method in accordance with disclosed embodiments. -
FIG. 16F depicts an exemplary architecture in accordance with described embodiments. -
FIG. 16G is a flow diagram illustrating a method in accordance with disclosed embodiments. - Alternatively, the user can take an incomplete row and predict all of the missing values to fill in the blanks. At the extreme, the user can begin a table with many missing values and render a table where all of the blanks were filled in. Specialized tools for this particular use case are discussed below in which functionality allows the user to trade off confidence for more or less data, such that more data (or all the data) can be populated with degrading confidence or only some data is populated, above a given confidence, and so forth. A specialized GUI is additionally provided and described for this particular use case. Such a GUI calls the predict query via PreQL via an API on behalf of the user, but fundamentally exercises Veritable's core.
-
FIG. 17A depicts a Graphical User Interface (GUI) to display and manipulate a tabular dataset having missing values by exploiting a PREDICT command term. - Here the table is provided as being 61% filled. No values are predicted, but the user may simply move the slider to increase the data fill for the missing values, causing the GUI's functionality to utilize the predict function on behalf of the user.
-
FIG. 17B depicts another view of the Graphical User Interface. - Here the table is provided as being 73% filled. Some but not all values are predicted. Not depicted here is the confidence threshold which is hidden from the user. Alternative interfaces allow the user to specify such a threshold.
-
FIG. 17C depicts another view of the Graphical User Interface. - Here the table is provided as being 100% filled. All values are predicted, but it may be necessary to degrade the confidence somewhat to attain 100% fill, though such a fill may nevertheless be feasible at acceptable levels of confidence. In each of the instances, the grey scale values show the original data and the blue values depict the predicted values which do not actually exist in the underlying table. In certain embodiments, the chosen fill level, selected by the user via the slider bar, can be “saved” to the original or a copy of the table, thus resulting in the predictive values provided being saved or input to the cell locations. Metadata can be used to recognize later that such values were predicted and not sourced.
-
FIG. 17D depicts an exemplary architecture in accordance with described embodiments. -
FIG. 17E is a flow diagram illustrating a method in accordance with disclosed embodiments. - Other specialized GUIs and API tools include business opportunity scoring, next best offer, etc. These and others are described in additional detail below.
- When making predictions, it is helpful to additionally let the users know whether they can trust the result. That is, how confident is the system in the result, and is the system literally capable of saying: “I do not know.” With such a system, the result may come back and tell the user if the answer is 1 or between 1 and 10 or between negative infinity and infinity.
- With probabilities, the system can advise the user that it is 90% confident that the answer given is real, accurate, and correct, or the system may alternatively return a result indicating that it simply lacks sufficient data, and thus, there is not enough known to render a prediction. For example, the Veritable core may be used to complete a data set as if it were real by filling in the missing but predicted information into a spreadsheet, from which the completed data may be used to draw conclusions, as is depicted in the above slides 33, 34, and 35. The control slider is feasible because when you complete income, for example, what is actually returned to the GUI functionality making the call is the respective person's income distribution.
- By using such a GUI interface or such a concept in general, the user is given control over accuracy and confidence. In such a way, the user can manipulate how much data is to be filled in and to what extent the confidence level applies. What Veritable does behind the scenes is to take a table of data with a bunch of typed columns, and then the depicted Perceptible GUI interface at slides 33, 34, and 35 asks for a prediction for every single cell. Then for each cell that is missing, the Perceptible GUI gets a distribution in return from the API call to the Veritable core for the individual cell. Then when the slider is manipulated by a user, functionality for the slider looks at the distributions and looks at their variances and then gives the estimates. Thus, for any given cell having a predicted result in place of the missing null value, having seen “a” and “b” and “c” it then returns a value for that column.
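The slider behavior can be sketched as ranking the per-cell distributions by variance and accepting the most certain ones first. This is a minimal sketch, assuming each missing cell has already been summarized as a (mean, variance) pair for a single numeric column; the function name `fill_to_target` and the sample figures are hypothetical.

```python
def fill_to_target(total_cells, known_cells, predictions, target_percent):
    """Pick the lowest-variance predictions needed to reach the target fill.

    predictions maps each missing cell to its (mean, variance) summary.
    Returns the chosen {cell: mean} and the largest variance accepted.
    """
    # Cells needed to reach the target, rounded up with integer arithmetic.
    wanted = (target_percent * total_cells + 99) // 100
    needed = max(0, min(wanted - known_cells, len(predictions)))
    # Most confident (lowest variance) predictions come first.
    ranked = sorted(predictions.items(), key=lambda item: item[1][1])
    chosen = ranked[:needed]
    filled = {cell: mean for cell, (mean, _variance) in chosen}
    worst_variance = chosen[-1][1][1] if chosen else None
    return filled, worst_variance


# Hypothetical income column: 10 cells, 6 known, 4 missing with predictions.
predictions = {
    "r7": (52000.0, 1.0e6),
    "r8": (61000.0, 9.0e8),
    "r9": (48000.0, 4.0e6),
    "r10": (75000.0, 2.5e9),
}
```

Moving the slider from 80% to 100% in this example admits the two high-variance cells as well, which is the trade of confidence for fill that the GUI exposes. Comparing variances across columns with different units would need some normalization, which this sketch does not attempt.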
- Starting with nothing more than raw data in a tabular form, such as data on paper, in a spreadsheet, or in tables of a relational database, an API call is first made to upload or insert the data into the predictive database upon which Veritable operates and then an API call is made to the Veritable core to analyze the data. Upon insert, the data looks just like all other data. But once uploaded and the analyze operation is initiated, a probabilistic model is executed against the data. So the Veritable core starts to look at the ways that the rows and the columns can interact with each other and starts to build the various relationships and causations. A generated statistical index figures out how and which columns are related to one another. Veritable goes through and says, for instance, these particular columns are likely to share a causal origin. The difficult problem is that Veritable must perform this analysis using real world realities rather than pristine and perfect datasets. With data that exists in the real world, some columns are junk, some columns are duplicates, some columns are heterogeneous, some columns are noisy with only sparse data, and Veritable's core functionality implementing the statistical index must be able to pull the appropriate relationships and causations out despite having to perform its analysis operations against real-world data.
- There will also be no one right answer, as there are uncertainties. So Veritable does not just build up one statistical model but rather, Veritable builds multiple statistical indices as a distribution or an ensemble of statistical indices. Veritable performs its analysis by searching through a large space for all of the ways that the data provided can possibly interact.
- The distribution of indices results in a model that is stored and is queryable by PreQL structured queries via Veritable's APIs. What Veritable figures out via the analysis operations is first how the columns group together and then how the various rows group together. The analysis thus discovers the hidden structure in the data to provide a reduced representation of a table that explains how rows and columns may be related such that they can be queried via PreQL.
-
FIG. 18 depicts feature moves and entity moves within indices generated from analysis of tabular datasets. - PreQL structured queries allow access to the queryable model and its ensemble of indices through specialized calls, including: “RELATED,” “SIMILAR,” “GROUP,” and “PREDICT,” each of which is introduced above at slides 28 through 32.
- Beginning with PREDICT, calling an appropriate API for the PREDICT functionality enables users to predict any chosen sub-set of data or any column/value. It is not required that an entire dataset be utilized to predict only a single value, as is typical with custom implemented models.
- Using PREDICT, the user provides or fixes the value of any column and then the PREDICT API accepts the fixed values and those the user wants to predict. The functionality then queries the Veritable core asking: “Given a row that has these values fixed, as provided by the user, then what would the distribution be?” For instance, the functionality could fix all but one column in the dataset and then predict the last one, as is done with customized models. But the PREDICT functionality is far more flexible, and thus, the user can change the column to be predicted at a whim, whereas custom implemented models simply lack this functionality as they lack the customized mathematical constructs to predict for such unforeseen columns or inquiries. That is to say, absent a particular function being pre-programmed, the models simply cannot perform this kind of varying query, for instance, for a user exploring data making multiple distinct queries or simply changing the column or columns to be predicted as business needs and the underlying data and data structures of the client organization change over time.
- Perhaps also the user does not know all the columns to fix. For instance, perhaps the dataset knows a few things about one user but lots about another user. For instance, an ecommerce site may know little about a non-registered passerby user but knows lots of information about a registered user with a rich purchase history. In such an example, the PREDICT functionality permits fixing or filling in only the stuff that is known without having to require all the data for all users, as some of the data is known to be missing. In such a way, the PREDICT functionality can still predict missing data elements with what is actually known.
- Another capability using the PREDICT functionality is to specify or fix all the data in a dataset that is known, that is, non-null, and then fill in everything else. In such a way, a user can say that what is known in the dataset is known, but much data is understood to be missing, and request predictions for the missing data nevertheless. The PREDICT operation would thus increase the population of predicted data for missing or null values by accepting decreasing confidence, until all or a specified percentage of the data is populated, much like the Perceptible GUI and slider examples described above.
- Another functionality using PREDICT is to fill in an empty set. So maybe data is wholly missing, and then you start generating data that represents new rows and the new data in those rows represents plausible data, albeit synthetic data.
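Generating a synthetic row can be sketched as sampling from the stored model with nothing fixed. The sketch below assumes categorical columns and a hypothetical model layout in which each category carries a weight and a per-column value distribution; `synthetic_row` is an illustrative name, not an API of the system.

```python
import random

# Hypothetical ensemble: each sample is a list of views; each view has
# categories with a prior weight and one categorical distribution per column.
ensemble = [
    [
        {"categories": [
            {"weight": 0.5,
             "dists": {"stage": {"quote": 0.8, "prospect": 0.2},
                       "is_won": {"yes": 0.7, "no": 0.3}}},
            {"weight": 0.5,
             "dists": {"stage": {"quote": 0.2, "prospect": 0.8},
                       "is_won": {"yes": 0.1, "no": 0.9}}},
        ]},
    ],
    [
        {"categories": [{"weight": 1.0,
                         "dists": {"stage": {"quote": 0.5, "prospect": 0.5}}}]},
        {"categories": [{"weight": 1.0,
                         "dists": {"is_won": {"yes": 0.4, "no": 0.6}}}]},
    ],
]


def synthetic_row(ensemble, rng):
    """Draw one plausible row: pick a sample, then a category per view,
    then a value per column from that category's distribution."""
    views = rng.choice(ensemble)
    row = {}
    for view in views:
        cats = view["categories"]
        cat = rng.choices(cats, weights=[c["weight"] for c in cats], k=1)[0]
        for column, dist in cat["dists"].items():
            values = list(dist)
            row[column] = rng.choices(
                values, weights=[dist[v] for v in values], k=1)[0]
    return row


if __name__ == "__main__":
    print(synthetic_row(ensemble, random.Random(0)))
```

Because values within a view are drawn from the same category, the generated rows preserve the dependencies between columns that the model has learned, rather than sampling each column independently.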
- In other embodiments, PREDICT can be used to populate data elements that are not known but should be present or may be present, yet are not filled in within the data set, thus allowing the PREDICT functionality to populate such data elements.
- Another example is to use PREDICT to attain a certainty or uncertainty for any element and to display or return the range of plausible values for the element.
- Next is the RELATED functionality. Given a table with columns or variables in it, Veritable's analysis behind the scenes divides the columns or variables into groups and because of the distributions there is more than one way to divide these columns or variables up. Take height for example. Giving the height column to an API call for RELATED, a user can query: “How confident can I be that a relationship exists between each of the other columns and the height column so specified?” Then what is returned from the Veritable core for height is a confidence for every other column in the dataset which was not specified. So for example, the RELATED functionality may return, as a confidence relative to the height column, “Weight=1.0,” meaning that Veritable, according to the dataset, is extremely confident that there is a relationship between weight and height. Such a result is somewhat intuitive and expected. But other results may be less intuitive and thus provide interesting results for exploration and additional investigation. Continuing with the “height” example for the specified column to a RELATED API call, Veritable may return “Age=0.8” meaning that Veritable is quite sure that a relationship exists, but not perfectly certain, due to, for instance, noisy data which precludes an absolute positive result. Perhaps also returned for the specified “height” column is “hair color=0.1” meaning there is realistically no correlation between a person's height and their hair color, according to the dataset utilized. Thus, the RELATED functionality permits a user to query for what matters for a given column, such as the height column, and the functionality returns all the columns with a scoring of how related the columns are to the specified column, based on their probability.
- Next is the SIMILAR functionality. Like the RELATED functionality, an API call to Veritable for SIMILAR accepts a row and then returns what other rows are most similar to the row specified. Like the RELATED examples, the SIMILAR functionality returns the probability that a row specified and any respective returned row actually exhibit similarity. For instance, rather than specifying a column, you specify “Fred” as a row in the dataset. Then you ask using the SIMILAR functionality, for “Fred,” what rows are scored based on probability to be the most like “Fred.” The API call can return all rows scored from the dataset or return only rows above or below a specified threshold. For instance, perhaps rows above 0.8 are the most interesting or the rows below 0.2 are most interesting, or both, or a range. Regardless, SIMILAR scores every row for the specified row and returns the rows and the score based on probability according to the user's constraints or the constraints of an implementing GUI, if any such constraints are given. Because the Veritable system figures out these relationships using its own analysis, there is more than one way to evaluate for this inquiry. Thus, the user must provide to an API call for SIMILAR the specified row to find and additionally the COLUMN which indicates how the user constructing the PreQL query or API call actually cares about the data. Thus, the API call requires both row and column to be fixed. In such a way, providing, specifying, or fixing the column variable provides disambiguation information to Veritable and the column indication tells the Veritable core where to enter the index. Otherwise there would be too many possible ways to score the returned rows as the Veritable core could not disambiguate how the caller cares about the information for which a similarity is sought.
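One plausible way to score similarity, sketched below, is to count how often two rows fall into the same category within the view that contains the disambiguating column. The ensemble, the row names, and the function name `similar` are hypothetical; the source does not specify the exact scoring formula.

```python
# Hypothetical ensemble of two cross-categorizations over four rows.
# Each view lists its columns and maps every row to a category id.
ensemble = [
    [
        {"columns": ["habitat"],
         "categories": {"orca": 0, "blue_whale": 0, "dolphin": 0, "zebra": 1}},
        {"columns": ["diet"],
         "categories": {"orca": 0, "dolphin": 0, "blue_whale": 1, "zebra": 2}},
    ],
    [
        {"columns": ["habitat"],
         "categories": {"orca": 0, "blue_whale": 0, "dolphin": 0, "zebra": 1}},
        {"columns": ["diet"],
         "categories": {"orca": 0, "dolphin": 0, "blue_whale": 1, "zebra": 1}},
    ],
]


def similar(ensemble, row, column):
    """Score each other row by how often it shares row's category
    in the view that contains the given column."""
    counts = {}
    for views in ensemble:
        view = next(v for v in views if column in v["columns"])
        cats = view["categories"]
        for other, category in cats.items():
            if other != row:
                counts[other] = counts.get(other, 0) + (category == cats[row])
    return {other: n / len(ensemble) for other, n in counts.items()}


if __name__ == "__main__":
    print(similar(ensemble, "orca", "habitat"))
    print(similar(ensemble, "orca", "diet"))
```

In this toy data the blue whale scores 1.0 against the orca in the context of habitat and 0.0 in the context of diet, which is the kind of ambiguity the column argument resolves.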
- Next is the GROUP functionality. Sometimes rows tend to group up on noisy elements in the dataset when the Veritable core applies its analysis; yet these elements may result in groupings that are not actually important. We know that each column will appear in exactly one of the groups as a view and so Veritable permits using that column to identify the particular “view” that will be utilized. The GROUP functionality therefore implements a row centric operation like the SIMILAR functionality, but in contrast to an API call for SIMILAR where you must give a row and the SIMILAR call returns back a list of other rows and a score based on their probabilities of being related, with the GROUP functionality, the API call requires no row to be given or fixed whatsoever. Only a column is thus provided when making a call to the GROUP functionality.
- Calling the GROUP functionality with a specified or fixed column causes the functionality to return the groupings of the ROWS that seem to be related or correlated in some way based on Veritable's analysis.
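For a single cross-categorization, the GROUP lookup reduces to finding the view that contains the given column and reading off its row categories. The sketch below assumes a hypothetical layout and function name (`group`); how groupings from several cross-categorizations in the ensemble are combined is not specified here and is left out.

```python
# Hypothetical cross-categorization: each view lists its columns and maps
# every row to a category id within that view.
cross_categorization = [
    {"columns": ["habitat"],
     "categories": {"orca": 0, "blue_whale": 0, "dolphin": 0, "zebra": 1}},
    {"columns": ["diet"],
     "categories": {"orca": 0, "dolphin": 0, "blue_whale": 1, "zebra": 2}},
]


def group(cross_categorization, column):
    """Return the row groupings from the view that contains column."""
    view = next(v for v in cross_categorization if column in v["columns"])
    groups = {}
    for row, category in view["categories"].items():
        groups.setdefault(category, []).append(row)
    # Sort rows within groups and the groups themselves for stable output.
    return sorted(sorted(rows) for rows in groups.values())


if __name__ == "__main__":
    print(group(cross_categorization, "habitat"))
    # [['blue_whale', 'dolphin', 'orca'], ['zebra']]
    print(group(cross_categorization, "diet"))
    # [['blue_whale'], ['dolphin', 'orca'], ['zebra']]
```

Note that no row is passed in: the column alone selects the view, and the view determines the groups.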
- In such a way, use of the PreQL structured queries permits programmatic queries into the predictive database in a manner similar to a programmer making SQL queries into a relational database. Rather than a “select” statement in the query the term is replaced with the “predict” or “similar” or “related” or “group” statements. For instance, an exemplary PreQL statement may read as follows: “PREDICT IS_WON, DOLLAR_AMOUNT FROM OPPORTUNITY WHERE STAGE=‘QUOTE’.” So in this example, “STAGE” is the fixed column and “QUOTE” is its fixed value, “OPPORTUNITY,” following “FROM,” is the dataset from which an opportunity is to be predicted, the “PREDICT” term is the call into the appropriate function, and “IS_WON” is a value to be predicted, that is to say, the functionality is to predict whether a given opportunity is likely or unlikely to be won where the “IS_WON” column may have completed data for some rows but be missing for other rows due to, for example, pending or speculative opportunities, etc. “DOLLAR_AMOUNT” is a second value to be predicted alongside “IS_WON.”
- In certain embodiments, the above query is implemented via a specialized GUI interface which accepts inputs from a user via the GUI interface and constructs, calls, and returns data via the PREDICT functionality on behalf of the user without requiring that the user actually write or even be aware of the underlying PreQL structured query made to the Veritable core.
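As an illustration of how a front end might take such a statement apart, the following sketch parses a PREDICT statement of the exemplary shape. It assumes the SQL-like reading in which the names after PREDICT are the columns to predict and the WHERE clause carries the fixed column values; the grammar, the straight-quote requirement, and the name `parse_predict` are assumptions made for this example, since the full PreQL syntax is not defined here.

```python
import re

# Assumed statement shape: PREDICT <cols> FROM <dataset> [WHERE c='v' AND ...]
_PREQL = re.compile(
    r"^\s*PREDICT\s+(?P<targets>.+?)\s+FROM\s+(?P<dataset>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE,
)
_CONDITION = re.compile(r"^\s*(\w+)\s*=\s*'([^']*)'\s*$")


def parse_predict(statement):
    """Split a PREDICT statement into targets, dataset, and fixed values."""
    match = _PREQL.match(statement)
    if match is None:
        raise ValueError("not a PREDICT statement of the assumed form")
    targets = [t.strip() for t in match.group("targets").split(",")]
    fixed = {}
    if match.group("where"):
        clauses = re.split(r"\s+AND\s+", match.group("where"),
                           flags=re.IGNORECASE)
        for clause in clauses:
            cond = _CONDITION.match(clause)
            if cond is None:
                raise ValueError(f"unsupported condition: {clause!r}")
            fixed[cond.group(1)] = cond.group(2)
    return {"targets": targets,
            "dataset": match.group("dataset"),
            "fixed": fixed}


if __name__ == "__main__":
    print(parse_predict(
        "PREDICT IS_WON, DOLLAR_AMOUNT FROM OPPORTUNITY WHERE STAGE='QUOTE'"))
```

The parsed targets and fixed values map directly onto the two inputs a PREDICT call needs: the columns to predict and the values that are held fixed.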
- There are additionally provided Veritable use case implementation embodiments and customized GUIs. According to a first embodiment, a specialized GUI implementation enables users to filter on a historical value by comparing a historical value versus a current value in a multi-tenant system. Historical data is filtered using a GUI's field option, wherein the GUI displays current fields related to historical fields as is depicted.
-
FIG. 19A depicts a specialized GUI to query using historical dates. - Embodiments provide for the ability to filter historical data by comparing historical value versus a constant in a multi-tenant system. The embodiments utilize the Veritable core by calling the appropriate APIs to make queries on behalf of the GUI users. The GUI performs the query and then consumes the data which is then presented back to the end users via the interface. Consider for example, a sales person looking at the sales information in a particular data set. The interface can take the distributions provided by Veritable's core and produce a visual indication for ranking the information according to a variety of customized solutions and use cases.
- For instance, in a particular embodiment, systems and methods for determining the likelihood of an opportunity to close using only closed opportunities are provided.
- SalesCloud is an industry leading CRM application currently used by 125,000 enterprise customers. Customers see the value of storing the data in the Cloud. These customers appreciate a web based interface to view and act on their data, and these customers like to use report and dashboard mechanisms provided by the cloud based service. Presenting these various GUIs as tabs enables salespeople and other end users to explore their underlying dataset in a variety of ways to learn how their business is performing in real-time. These users also rely upon partners to extend the provided cloud based service capabilities through APIs.
- A cloud based service that offers customers the opportunity to learn from the past and draw data driven insights is highly desirable as such functionality should help these customers make intelligent decisions about the future for their business based on their existing dataset.
- The customized GUIs utilize Veritable's core to implement predictive models which may vary per customer organization or be tailored to a particular organization's needs via programmatic parameters and settings exposed to the customer organization to alter the configuration and operation of Veritable's functionality.
- For instance, a GUI may be provided to compute and assign an opportunity score based on probability for a given opportunity reflecting the likelihood of that opportunity to close as a win or loss. The data set to compute this score would consist of all the opportunities that have been closed (either won or lost) in a given period of time, such as 1, 2, or 3 years or a lifetime of an organization, etc. Additional data elements from the customer organization's dataset may also be utilized, such as the account object as an input. Machine learning techniques implemented via Veritable's core, such as SVM, Regression, Decision Trees, PGM, etc., are then used to build an appropriate model to render the opportunity score and then the GUI depicts the information to the end user via the interface.
- Systems and methods for determining the likelihood of an opportunity to close using historical trending data are additionally disclosed. For instance, a historical selector for picking relative or absolute dates is described.
-
FIG. 19B depicts an additional view of a specialized GUI to query using historical dates. - With this example we enable users to look at how an opportunity has changed over time, independent of stage, etc. The user can additionally look at how that opportunity has matured from when created until when it was closed.
- Systems and methods for determining the likelihood of an opportunity to close at a given stage using historical trending data are additionally disclosed. Where the above example operates independent of the stage of the sales opportunity, this example further focuses on the probability of closing at a given stage as a further limiting condition for the closure. Thus, customers are enabled to use the historical trending data to know exactly when the stage has changed and then additionally predict what factors were involved to move from stage 1 to 2, from stage 2 to 3, and so forth.
- Systems and methods for determining the likelihood for an opportunity to close given social and marketing data are additionally disclosed. In this example, the dataset of the customer organization or whoever is utilizing the system is expanded on behalf of the end user beyond that which is specified and then that additional information is utilized to further influence and educate the predictive models. For instance, certain embodiments pull information from an exemplary website such as “data.com,” and then the data is associated with each opportunity in the original data set to discover further relationships, causations, and hidden structure which can then be presented to the end user. Other data sources are equally feasible, such as pulling data from social networking sites, search engines, data aggregation service providers, etc.
- In one embodiment, social data is retrieved and a sentiment is provided to the end-user via the GUI to depict how the given product is viewed by others in a social context. Thus, a salesperson can look at the person's LinkedIn profile and, with information from data.com or other sources, the salesperson can additionally be given sentiment analysis in terms of social context for the person that the salesperson is actually trying to sell to. For instance, has the target purchaser commented about other products or have they complained about any other products, etc. Each of these data points and others may help influence the model employed by Veritable's core to render a prediction.
- Systems and methods for determining the likelihood for an opportunity to close given industry specific data are additionally disclosed. For instance, rather than using socially relevant data for social context of sentiment analysis, industry specific data can be retrieved and input to the predictive database upon which Veritable performs its analysis as described above, and from which further exploration can then be conducted by users of the dataset now having the industry specific data integrated therein.
- According to other embodiments, datasets are explored beyond the boundaries of any particular customer organization having data within the multi-tenant database system. For instance, in certain embodiments, benchmark predictive scores are generated based on industry specific learning using cross-organizational data stored within the multi-tenant database system. For example, data mining may be performed against telecom specific customer datasets, given their authorization or license to do so. Such cross-organization data, which renders a much larger multi-tenant dataset, can then be analyzed via Veritable to provide insights, relationships, causations, and additional hidden structure that may not be present within a single customer organization's dataset. For instance, if as a customer you are trying to close a $100 k deal in the NY-NJ-Virginia tri-city area, the probability for that deal to close in 3 months may be, according to such analysis, 50% because past transactions have shown that it could take up to six months to close a $100 k telecom deal in the NY-NJ-Virginia tri-city area when viewed in the context of multiple customer organizations' datasets. Many of the insights realized through such a process may be non-intuitive, yet capable of realization through application of the techniques described herein.
- With industry specific data present within a given dataset it is possible to delve even deeper into the data and identify benchmarks using such data for a variety of varying domains across multiple different industries. For instance, based on such data predictive analysis may reveal that, in a given region, it takes six months to sell sugar in the Midwest and it takes three months to sell laptops on the East Coast, and so forth.
- Then if a new opportunity arises and a vendor is trying to, for example, sell watches in California, the vendor can utilize such information to gain a better understanding of the particular regional market based on the predictions and confidence levels given.
- Provided functionality can additionally predict information for a vertical sector as well as for the region. When mining a customer organization's dataset a relationship may be discovered that where customers bought a, those customers also bought b. These kinds of matching relationships are useful, but can be further enhanced. For instance, using the predictive analysis of Veritable it is additionally possible to identify the set of factors that led to a particular opportunity score (e.g., a visualized presentation of such analysis).
-
FIG. 19C depicts another view of a specialized GUI to configure predictive queries. - Thus, the GUI presents a 42% opportunity score at the user interface, but when the user mouses over the opportunity score, the GUI then displays the sub-detail elements that make up that opportunity score. The GUI makes the necessary Veritable based API calls on behalf of the user such that an appropriate call is made to the predictive platform to pull the opportunity score and display that information to the user as well as the sub-detail relationships and causations considered relevant.
- The GUI can additionally leverage the predict and analyze capabilities of Veritable which upon calling a predict function for a given opportunity will return data necessary to create a histogram for an opportunity. So not only can the user be given a score, but the user can additionally be given the factors and guidance on how to interpret the information provided and what to do with such information.
- Moreover, as the end-users, such as salespersons, see the data and act upon it, a feedback loop is created through which further data is input into the predictive database upon which additional predictions and analysis are carried out in an adaptive manner. For example, as the Veritable core learns more about the data the underlying models may be refreshed on a monthly basis by re-performing the analysis of the dataset so as to re-calibrate the data using the new data obtained via the feedback loop.
- Additionally disclosed are systems and methods to deliver a matrix report for historical data in a multi-tenant system. For instance, consider a summary view of a matrix report according to provided embodiments.
- Systems and methods to deliver a matrix report for historical data in a multi-tenant system follow the established matrix report format familiar to salesforce.com customers but which is limited only to current data and cannot display historical data. With Veritable's capabilities the historical data can additionally be provided via the matrix reports.
-
FIG. 19D depicts an exemplary architecture in accordance with described embodiments. -
FIG. 19E is a flow diagram illustrating a method in accordance with disclosed embodiments. -
FIG. 20A depicts a pipeline change report in accordance with described embodiments. For example, a user can request to be shown the open pipeline for the current month by stage. - In a summary view, users can see data in an aggregate fashion. Each stage may consist of multiple opportunities and each might be able to be duplicated because each might change according to the amount or according to the stage, etc. Thus, if a user is looking at the last four weeks, then one opportunity may change from $500 to $1500 and thus be duplicated.
- The cloud computing architecture executes functionality which runs across all the data for all tenants. Thus, for any cases, leads, and opportunities, the database maintains a history object into which all of the audit data is retained such that a full and rich history can later be provided to the user at their request to show the state of any event in the past, without corrupting the current state of the data. Thus, while the underlying data must be maintained in its correct state for the present moment, a user may nevertheless utilize the system to display the state of a particular opportunity as it stood last week, or as it transitioned through the past quarter, and so forth.
- All of the audit data from history objects for various categories of data is then aggregated into a historical trending entity which is a custom object that stores any kind of data. This object is then queried by the different historical report types across multiple tenants to retrieve the necessary audit trail data such that any event at any time in the past can be re-created for the sake of reporting, predictive analysis, and exploration. The historical audit data may additionally be subjected to the analysis capabilities of Veritable by including it within a historical dataset for the sake of providing further predictive capabilities.
- The algorithms to provide historical reporting capabilities are applied across all the tenant data which is common within the historical trending data object and the interim opportunity history and lead history, etc.
- Within the matrix report, the data can also be visualized using salesforce.com's charting engine as depicted by the waterfall diagram.
-
FIG. 20B depicts a waterfall chart using predictive data in accordance with described embodiments. - Systems and methods to deliver waterfall charts for historical data in a multi-tenant system are thus provided. For instance, on the x axis is the weekly snapshot for all the opportunities being worked. The amounts change up and down and are also grouped by stage. The waterfall enables a user to look at two points in time by defining the opportunities between day one and day two. Alternatively, waterfall diagrams can be used to group all opportunities into different stages, as in the example above in which every opportunity is mapped according to its stage, allowing a user to look into the past and understand what the timing is for these opportunities to actually come through to closure.
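- As a minimal sketch of the two-points-in-time comparison described above, the following computes the bars of a waterfall between a day-one snapshot and a day-two snapshot, grouped by stage. The snapshot layout (opportunity id mapped to a stage and an amount), the sample values, and the function name `waterfall_by_stage` are assumptions made for illustration and are not taken from the described embodiments.

```python
from collections import defaultdict


def waterfall_by_stage(snapshot_start, snapshot_end):
    """Return (start total, net change per stage, end total).

    Each snapshot maps an opportunity id to a (stage, amount) pair.
    A change is attributed to the opportunity's stage in the end snapshot,
    or to its stage in the start snapshot if it no longer appears.
    """
    deltas = defaultdict(float)
    for opp_id in snapshot_start.keys() | snapshot_end.keys():
        start = snapshot_start.get(opp_id)
        end = snapshot_end.get(opp_id)
        stage = (end or start)[0]
        start_amount = start[1] if start else 0.0
        end_amount = end[1] if end else 0.0
        deltas[stage] += end_amount - start_amount
    start_total = sum(amount for _, amount in snapshot_start.values())
    end_total = sum(amount for _, amount in snapshot_end.values())
    return start_total, dict(deltas), end_total


# Hypothetical snapshots: opp1 grows and advances, opp2 drops out, opp3 is new.
day_one = {"opp1": ("Prospecting", 500.0), "opp2": ("Quotation", 2000.0)}
day_two = {"opp1": ("Quotation", 1500.0), "opp3": ("Prospecting", 800.0)}

start_total, stage_deltas, end_total = waterfall_by_stage(day_one, day_two)
print(start_total, stage_deltas, end_total)
```

The per-stage deltas are the intermediate bars of the chart; by construction the start total plus the sum of the deltas equals the end total.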
- Historical data and the audit history saved to the historical trending data object are enabled through snapshots and field history. Using the historical trending data object the desired data can then be queried. The historical trending data object may be implemented as one table with indexes on the table from which any of the desired data can then be retrieved. The software modules implementing the various GUIs and use cases populate the depicted table using the opportunity data retrieved from the historical trending data object's table.
- Additionally disclosed are systems and methods to deliver waterfall charts for historical data in a multi-tenant system and historical trending in a multi-tenant environment. Systems and methods for using a historical selector for picking relative or absolute dates are additionally described.
- These specialized implementations enable users to identify how the data has changed on a day to day basis or week to week basis or over a month to month basis, etc. The users can therefore see the data that is related to their opportunities not just for the present time; with this feature, the users can identify opportunities based on a specified time, such as an absolute time or a relative time, so that they can see how an opportunity has changed over time. In this embodiment, time as a dimension is used to provide a decision tree for the customers to pick either an absolute date or a range of dates. Within the date, customers can pick an absolute date, such as Jan. 1, 2013, or a relative date, such as the first day of the current month or the first day of the last month, etc.
- This solves the problem of a sales manager or sales person needing to see how the opportunity has changed today versus the first day of this month or last month. With this capability, the user can take a step back in time, thinking back where they were a week ago, or a month ago and identify the opportunity by creating a range of dates and displaying what opportunities were created during those dates.
- Thus, a salesperson wanting such information may have had ten opportunities and on Feb. 1, 2013, the salesperson's target buyer expresses interest in a quote, so the stage changes from prospecting to quotation. Another target buyer, however, says they want to buy immediately, so the state changes from quotation to sale/charge/close. The functionality therefore provides a back end which implements a decision tree with the various dates that are created. The result is that the functionality can give the salesperson a view of all the opportunities that are closing in the month of January, or February, or within a given range, etc.
- The query for dates is unique because it is necessary to traverse the decision tree to get to the date the user picks, after which the user can additionally pick the number of snapshots from which the finalized result set is determined, for instance, from Feb. 1 to Feb. 6, 2013.
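- The date selection described above can be sketched as a small resolver that branches on absolute versus relative dates and then expands the chosen start date into a number of snapshots. This is an illustrative sketch only: the relative-date names, the daily default step, and the functions `resolve_date` and `snapshot_dates` are hypothetical and do not reflect the actual back end.

```python
from datetime import date, timedelta


def resolve_date(spec, today):
    """Resolve an absolute date or a named relative date to a concrete date."""
    if isinstance(spec, date):
        return spec  # absolute date, e.g. date(2013, 1, 1)
    if spec == "first_day_of_current_month":
        return today.replace(day=1)
    if spec == "first_day_of_last_month":
        # Step back to the last day of the previous month, then to its day 1.
        last_of_previous = today.replace(day=1) - timedelta(days=1)
        return last_of_previous.replace(day=1)
    raise ValueError(f"unknown relative date: {spec!r}")


def snapshot_dates(start, count, step_days=1):
    """Return `count` snapshot dates beginning at `start`, `step_days` apart."""
    return [start + timedelta(days=i * step_days) for i in range(count)]


# Six daily snapshots starting on the first day of the current month.
today = date(2013, 2, 15)
dates = snapshot_dates(resolve_date("first_day_of_current_month", today), 6)
print(dates[0], dates[-1])
```

Resolving relative dates against an explicit `today` argument, rather than the system clock, keeps the selection reproducible when the same report is re-run later.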
- Additionally described is the ability to filter historical data by comparing historical values versus current values in a multi-tenant system as is shown.
-
FIG. 20C depicts an interface with defaults after adding a first historical field. - Additionally enabled is the ability to filter historical data by comparing historical values versus a constant in a multi-tenant system, referred to as a historical selector. Based on the opportunity or report type, the customer has the ability to filter on historical data using a custom historical filter. The interface provides the ability for the customer to look at all of the filters on the left that they can use to restrict a value or a field, thus allowing customers to filter on historical column data for any given value. Thus, a customer may look at all of the open opportunities for a given month or filter the data set according to current column data rather than historical. Thus, for a given opportunity a user at the interface can fill out the amount, stage, close date, probability, forecast category, or other data elements and then as the salesperson speaks with the target buyer, the state is changed from prospecting to quoting, to negotiation based on the progress that is made with the target buyer, and eventually to a state of won/closed or lost, etc. So maybe the target buyer is trying to decrease the amount of the deal and the salesperson is trying to increase the amount. All of that data and those state changes (e.g., a change in amount can be a state change within a given phase of the deal) are stored in the historical opportunity object, which provides the audit trail.
- As the current data changes, the data in the current tables changes, and thus the historical data is not directly accessible to the customer. But the audit trail is retained and so it can be retrieved. For instance, the GUI enables a user to go back 12 months according to one embodiment. Such historical data and audit data may be processed with a granularity of one day, and thus, a salesperson can go back in time and view how the data has changed over time within the data set with daily granular reporting.
- Thus, for any given opportunity object with all the object history and the full audit trail with daily granular data, the historical trending entity object is used to allow the tool to pull the information about how these opportunities changed over time for the salesperson. Such metrics would be useful to other disciplines also, such as a service manager running a call center who gets 100 cases from sales agents and wants to know how to close those calls, etc. Likewise, when running a marketing campaign, regardless of whether the campaign spend is in California, or Tokyo, etc., the campaign managers will want to know how to close the various leads and opportunities as well as peer back into history to see how events influenced the results of past opportunities.
- Additional detail with respect to applying customized filters to historical data is further depicted, as follows:
-
FIG. 20D depicts in additional detail an interface with defaults for an added custom filter. -
FIG. 20E depicts another interface with defaults for an added custom filter. -
FIG. 20F depicts an exemplary architecture in accordance with described embodiments. -
FIG. 20G is a flow diagram illustrating a method in accordance with disclosed embodiments. - According to one embodiment, the historical trending entity object is implemented via a historical data schema in which history data is stored in a new table core.historical_entity_data as depicted at Table 1 below:
-
TABLE 1
| column name | data type | nullable | notes |
| --- | --- | --- | --- |
| organization_id | char(15) | no | |
| key_prefix | char(3) | no | key prefix of historical data itself |
| historical_entity_data_id | char(15) | no | |
| parent_id | char(15) | no | FK to the parent record |
| transaction_id | char(15) | no | generated key used to uniquely identify transaction that changed the parent record. Main purpose is to reconcile multiple changes that may occur in one transaction (custom field versus standard field, for example may be written separately) and enable asynchronous fixer operations (if used). |
| division | number | no | |
| currency_iso_code | char(3) | no | |
| deleted | char(1) | no | |
| row_version | number | no | |
| standard audit fields | | | |
| valid_from | date | no | with valid_to, defines time period the data is valid. The time periods (valid_from, valid_to) for each snapshot of the same parent don't overlap. Gaps are allowed. |
| valid_to | date | no | default to 3000/1/1 for current data |
| val0 . . . val800 | varchar(765) | yes | flex fields for storing historic values |

- Indices utilized in the above Table 1 include: organization_id, key_prefix, historic_entity_data_id. PK includes: organization_id, key_prefix, system_modstamp. Unique, find, and snapshot for given date and parent record: organization_id, key_prefix, parent_id, valid_to, valid_from. Indices organization_id, key_prefix, valid_to facilitate data clean up. Such a table is additionally counted against storage requirements according to certain embodiments. Usage may be capped at 100. Alternatively, when available slots are running low, old slots may be cleaned. Historical data management, row limits, and statistics may be optionally utilized. For new history the system assumes an average of 20 bytes per column and 60 effective columns (50 effective data columns + PK + audit fields) for the new history table, and thus, row size is 1200 bytes. For row estimates the system assumes that historical trending will have usage patterns similar to entity history. 
Since historical trending storage is charged to the customer's applicable resource limits, it is expected that usage will not be heavier than usage of entity history.
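- To illustrate how the valid_from and valid_to columns of Table 1 can serve a snapshot lookup for a given date and parent record, the following sketch builds a reduced copy of the table in an in-memory SQLite database. Several points are assumptions rather than details from the described embodiments: only a subset of the columns is modeled, the schema prefix is dropped, dates are stored as ISO strings, the validity period is treated as half-open (valid_from inclusive, valid_to exclusive), and the identifiers, index name, and helper `value_as_of` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE historical_entity_data (
        organization_id TEXT NOT NULL,
        key_prefix      TEXT NOT NULL,
        parent_id       TEXT NOT NULL,
        valid_from      TEXT NOT NULL,
        valid_to        TEXT NOT NULL DEFAULT '3000-01-01',
        val0            TEXT
    )
""")
# Mirrors the snapshot index columns listed for Table 1.
conn.execute("""
    CREATE INDEX snapshot_lookup ON historical_entity_data
        (organization_id, key_prefix, parent_id, valid_to, valid_from)
""")

ORG, PREFIX, OPP = "00D000000000001", "006", "006000000000001"
conn.executemany(
    "INSERT INTO historical_entity_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (ORG, PREFIX, OPP, "2013-01-01", "2013-02-01", "500"),   # past amount
        (ORG, PREFIX, OPP, "2013-02-01", "3000-01-01", "1500"),  # current amount
    ],
)


def value_as_of(conn, org_id, key_prefix, parent_id, as_of):
    """Return val0 for the snapshot valid on `as_of`, or None if in a gap."""
    row = conn.execute(
        """SELECT val0 FROM historical_entity_data
           WHERE organization_id = ? AND key_prefix = ? AND parent_id = ?
             AND valid_from <= ? AND ? < valid_to""",
        (org_id, key_prefix, parent_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


print(value_as_of(conn, ORG, PREFIX, OPP, "2013-01-15"))
print(value_as_of(conn, ORG, PREFIX, OPP, "2013-02-10"))
```

Because the periods for one parent do not overlap, at most one row matches any date, and a date that falls in a gap simply returns no row.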
- Sampling of production data revealed that recent growth in row count for entity history is ~2.5 B (billion) rows/year. Since historical trending will store a single row for any number of changed fields, an additional factor of 0.78 can be applied. Since historical trending will only allow 4 standard and at most 5 custom objects, an additional factor of 0.87 can be used to only include top standard and custom objects contributing to entity history. With an additional factor of 0.7 to only include UE and EE organizations, the expected row count for historical trending is 1.2 B rows/year in the worst case scenario.
- Historical data may be stored by default for 2 years and the size of the table is expected to stay around 2.4 B rows. Custom value columns are to be handled by custom indexes similar to custom objects. To prevent unintentional abuse of the system, for example, by using automated scripts, each organization will have a history row limit for each object. Such a limit could be between approximately 1 and 5 million rows per object which is sufficient to cover storage of current data as well as history data based on analyzed usage patterns of production data with only very few organizations (3-5) occasionally having so many objects that they would hit the configurable limit. The customized table can be custom indexed to help query performance.
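- The sizing figures above follow from straightforward arithmetic, reproduced here as a worked check. The input values are those stated in the text; the variable names are illustrative, and the final total table size is a derived figure that does not appear in the text and assumes every row has the stated average size.

```python
# Inputs stated in the text.
entity_history_rows_per_year = 2.5e9   # sampled entity history growth
single_row_factor = 0.78               # one row per change, any number of fields
object_limit_factor = 0.87             # 4 standard and at most 5 custom objects
edition_factor = 0.70                  # UE and EE organizations only
retention_years = 2                    # default retention
bytes_per_column = 20
effective_columns = 60                 # 50 data columns + PK + audit fields

rows_per_year = (entity_history_rows_per_year * single_row_factor
                 * object_limit_factor * edition_factor)
steady_state_rows = rows_per_year * retention_years
row_bytes = bytes_per_column * effective_columns

# Derived under the stated assumptions; not a figure given in the text.
table_bytes = steady_state_rows * row_bytes

print(f"rows per year:     {rows_per_year / 1e9:.2f} B")
print(f"steady-state rows: {steady_state_rows / 1e9:.2f} B")
print(f"row size:          {row_bytes} bytes")
print(f"table size:        {table_bytes / 1e12:.2f} TB")
```

The product of the three factors and the 2.5 B baseline rounds to the 1.2 B rows/year stated above, and two years of retention rounds to the stated 2.4 B rows.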
- High level use cases for such historical based data in a dataset to be analyzed and subjected to Veritable's predictive analysis include: Propensity to Buy and Lead Scoring for sales representatives and marketing users. For instance, sales users often get leads from multiple sources (marketing, external, sales prospecting etc.) and often times, in any given quarter, they have more leads to follow up than they have time. Sales representatives often need guidance with key questions such as: which leads have the highest propensity to buy, what is the likelihood of a sale, what is the potential revenue impact if this lead is converted to an opportunity, what is the estimated sale cycle based on historical observations if this lead is converted to an opportunity, what is the lead score for each of their leads in their pipeline so that sales representatives can discover the high potential sales leads in their territories, and so forth. Sales representatives may seek to determine the top ten products each account will likely buy based on the predictive analysis and the deal sizes if they successfully close, the length of the deal cycle based on the historical trends of similar accounts, and so forth. When sales representatives act on these recommendations, they can broaden their pipeline and increase their chance to meet or exceed quota, thus improving sales productivity, business processes, prospecting, and lead qualification.
- Additional use cases for such historical based data may further include: likelihood to close/win and opportunity scoring. For instance, sales representatives and sales managers may benefit from such data as they often have many deals in their current pipeline and must juggle where to apply their time and attention in any month/quarter. As these sales professionals approach the end of the sales period, the pressure to meet their quota is significant. Opportunity scoring can assist with ranking the opportunities in the pipeline based on the probability of such deals to close, thus improving the overall effectiveness of these sales professionals.
- Data sources may include such data as: Comments, sales activities logged, standard field numbers for activities (e.g., events, log a call, tasks etc.), C-level customer contacts, decision maker contacts, close dates, standard field numbers for times the close date has pushed, opportunity competitors, standard field opportunities, competitive assessments, executive sponsorship, standard field sales team versus custom field sales team as well as the members of the respective teams, chatter feed and social network data for the individuals involved, executive sponsor involved in a deal, DSRs (Deal Support Requests), and other custom fields.
- Historical based data can be useful to Veritable's predictive capabilities for generating metrics such as Next Likelihood Purchase (NLP) and opportunity whitespace for sales reps and sales managers. For instance, a sales rep or sales manager responsible for achieving quarterly sales targets will undoubtedly be interested in: which types of customers are buying which products; which prospects most resemble existing customers; are the right products being offered to the right customer at the right price; what more can we sell to my customer to increase the deal size, and so forth. Looking at historical data of things that similar customers have purchased to uncover selling trends, and using such metrics yields valuable insights to make predictions about what customers will buy next, thus improving sales productivity and business processes.
- Another capability provided to end users is to provide customer references on behalf of sales professionals and other interested parties. When sales professionals require customer references for potential new business leads they often spend significant time searching through and piecing together such information from CRM sources such as custom applications, intranet sites, or reference data captured in their databases. However, the Veritable core and associated use case GUIs can provide key information to these sales professionals. For instance, the application can provide data that is grouped according to industry, geography, size, similar product footprint, and so forth, as well as provide in one place what reference assets are available for those customer references, such as customer success stories, videos, best practices, which reference customers are available to chat with a potential buyer, customer reference information grouped according to the contact person's role, such as CIO, VP of sales, etc., which reference customers have been over utilized and thus may not be good candidate references at this time, who are the sales representatives or account representatives for those reference customers at the present time or at any time in the past, who is available internal to an organization to reach or make contact with the reference customer, and so forth. This type of information is normally present in database systems but is not organized in such a way and is extremely labor intensive to retrieve. However, Veritable's analysis can identify such relationships and hidden structure in the data which may then be retrieved and displayed by specialized GUI interfaces for end-users. 
Additionally, the functionality can identify the most ideal or the best possible reference customer among many based on predictive analysis which builds the details of a reference customer into a probability to win/close the opportunity, which is data wholly unavailable from conventional systems.
- In other embodiments, filter elements are provided to the user to narrow or limit the search according to desired criteria, such as industry, geography, deal size, products in play, etc. Such functionality thus aids sales professionals with improving sales productivity and business processes.
- According to other embodiments, functionality is provided to predict forecast adjustments on behalf of sales professionals. For instance, businesses commonly have a system of sales forecasting as part of their critical management strategy. Yet, such forecasts are, by nature, inexact. The difficulty is knowing in which direction such forecasts are wrong and then turning that understanding into an improved picture of how the business is doing. Veritable's functionality can improve such forecasting using existing data of a customer organization including existing forecasting data. For instance, applying Veritable's analysis to the past forecasts of the business can aid in trending and in improving existing forecasts for future periods which have yet to be realized. Sales managers are often asked to provide their judgment or adjustment on forecasting data for their respective sales representatives which requires such sales managers to aggregate their respective sales representatives' individual forecasts. This is a labor intensive process which tends to induce error. Sales managers are intimately familiar with their representatives' deals and they spend time reviewing them on a periodic basis as part of a pipeline assessment. Improved forecasting results can aid such managers with rendering improved judgments and assessments as well as help with automating the aggregating function which is often carried out manually or using inefficient tools, such as spreadsheets, etc.
- In such an embodiment, Veritable's analysis function mines past forecast trends by the sales representatives for relationships and causations such as forecast versus quota versus actual for the past eight quarters or other time period, and then provides a recommended judgment and/or adjustment that should be applied to the current forecast. By leveraging the analytical assessment at various levels of the forecast hierarchy, organizations can reduce the variance between individual sales representative's stipulated quotas, forecasts, and actuals, over a period of time, thereby narrowing deltas between forecast and realized sales via improved forecast accuracy.
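- As a deliberately simple stand-in for the recommended adjustment described above, the sketch below scales the current forecast by the average ratio of actual to forecast over past quarters. It does not represent Veritable's analysis, which mines relationships and causations and is not specified at this level of detail; the quarterly figures and the function name `recommended_adjustment` are hypothetical.

```python
from statistics import mean


def recommended_adjustment(history, current_forecast):
    """Scale the current forecast by the mean past actual/forecast ratio.

    `history` is a list of (forecast, actual) pairs, one per past quarter.
    Returns (adjustment factor, adjusted forecast).
    """
    ratios = [actual / forecast for forecast, actual in history if forecast > 0]
    if not ratios:
        raise ValueError("no usable forecast history")
    factor = mean(ratios)
    return factor, current_forecast * factor


# Hypothetical figures for one sales representative over the past eight quarters.
past_eight_quarters = [
    (100, 80), (100, 100), (200, 180), (200, 180),
    (150, 135), (150, 135), (100, 90), (100, 90),
]

factor, adjusted = recommended_adjustment(past_eight_quarters, 200)
print(f"adjustment factor: {factor:.2f}, adjusted forecast: {adjusted:.1f}")
```

A factor below 1 indicates a representative who has historically over-forecast, which gives the direction of the error as well as its size; quota could be added as a third element of each tuple to compare forecast versus quota versus actual.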
- Additional functionality enables use case GUI interfaces to render a likelihood to renew an opportunity or probability of retention for an opportunity by providing a retention score. Such functionality is helpful to sales professionals as such metrics can influence where a salesperson's time and resources are best spent so as to maximize revenue.
-
FIG. 21A provides a chart depicting prediction completeness versus accuracy. -
FIG. 21B provides a chart depicting an opportunity confidence breakdown. -
FIG. 21C provides a chart depicting an opportunity win prediction. -
FIG. 22A provides a chart depicting predictive relationships for opportunity scoring. -
FIG. 22B provides another chart depicting predictive relationships for opportunity scoring. -
FIG. 22C provides another chart depicting predictive relationships for opportunity scoring. - Unbounded Categorical Data types model categorical columns where new values that are not found in the dataset can show up. For example, most opportunities will be replacing one of a handful of common existing systems, such as an Oracle implementation, but a new opportunity might be replacing a new system which has not been seen in the data ever before.
-
FIG. 1 depicts an alternative exemplary architectural overview 300 of the environment in which embodiments may operate. In particular, there are depicted multiple customer organizations 305A, 305B, and 305C. Obviously, there may be many more customer organizations than those depicted. In the depicted embodiment, each of the customer organizations 305A-C includes at least one client device 306A, 306B, and 306C. A user may be associated with such a client device, and may further initiate requests to the host organization 310 which is connected with the various customer organizations 305A-C and client devices 306A-C via network 325 (e.g., such as via the public Internet), thus establishing a relationship between the cloud based services provider and the customer organizations. - The client devices 306A-C each individually transmit
request packets 316 to the remote host organization 310 via the network 325. The host organization 310 may responsively send response packets 315 to the originating customer organization to be received via the respective client devices 306A-C. Such interactions thus establish the communications necessary to transmit and receive information in fulfillment of the described embodiments on behalf of each of the customer organizations and the host organization 310 providing the cloud based computing services including access to the Veritable functionality described. - Within
host organization 310 is a request interface 375 which receives the packet requests 315 and other requests from the client devices 306A-C and facilitates the return of response packets 316. Further depicted is a PreQL query interface 380 which operates to query the predictive database 350 in fulfillment of such request packets from the client devices 306A-C, for instance, issuing API calls for PreQL structure query terms such as “PREDICT,” “RELATED,” “SIMILAR,” and “GROUP.” Also available are the API calls for “UPLOAD” and “ANALYZE,” so as to upload new data sets or define datasets to the predictive database 350 and trigger the Veritable core 390 to instantiate analysis of such data. Server side application 385 may operate cooperatively with the various client devices 306A-C. Veritable core 390 includes the necessary functionality to implement the embodiments described herein. -
FIG. 2 illustrates a block diagram of an example of an environment 210 in which an on-demand database service might be used. Environment 210 may include user systems 212, network 214, system 216, processor system 217, application platform 218, network interface 220, tenant data storage 222, system data storage 224, program code 226, and process space 228. In other embodiments, environment 210 may not have all of the components listed and/or may have other elements instead of, or in addition to, those listed above. -
Environment 210 is an environment in which an on-demand database service exists. User system 212 may be any machine or system that is used by a user to access a database user system. For example, any of user systems 212 can be a handheld computing device, a mobile phone, a laptop computer, a work station, and/or a network of computing devices. As illustrated in FIG. 2 (and in more detail in FIG. 3) user systems 212 might interact via a network 214 with an on-demand database service, which is system 216. - An on-demand database service, such as
system 216, is a database system that is made available to outside users that do not need to necessarily be concerned with building and/or maintaining the database system, but instead may be available for their use when the users need the database system (e.g., on the demand of the users). Some on-demand database services may store information from one or more tenants stored into tables of a common database image to form a multi-tenant database system (MTS). Accordingly, “on-demand database service 216” and “system 216” are used interchangeably herein. A database image may include one or more database objects. A relational database management system (RDBMS) or the equivalent may execute storage and retrieval of information against the database object(s). Application platform 218 may be a framework that allows the applications of system 216 to run, such as the hardware and/or software, e.g., the operating system. In an embodiment, on-demand database service 216 may include an application platform 218 that enables creating, managing, and executing one or more applications developed by the provider of the on-demand database service, users accessing the on-demand database service via user systems 212, or third party application developers accessing the on-demand database service via user systems 212. - The users of
user systems 212 may differ in their respective capacities, and the capacity of a particular user system 212 might be entirely determined by permissions (permission levels) for the current user. For example, where a salesperson is using a particular user system 212 to interact with system 216, that user system has the capacities allotted to that salesperson. However, while an administrator is using that user system to interact with system 216, that user system has the capacities allotted to that administrator. In systems with a hierarchical role model, users at one permission level may have access to applications, data, and database information accessible by a lower permission level user, but may not have access to certain applications, database information, and data accessible by a user at a higher permission level. Thus, different users will have different capabilities with regard to accessing and modifying application and database information, depending on a user's security or permission level. -
Network 214 is any network or combination of networks of devices that communicate with one another. For example, network 214 can be any one or any combination of a LAN (local area network), WAN (wide area network), telephone network, wireless network, point-to-point network, star network, token ring network, hub network, or other appropriate configuration. As the most common type of computer network in current use is a TCP/IP (Transmission Control Protocol and Internet Protocol) network, such as the global internetwork of networks often referred to as the “Internet” with a capital “I,” that network will be used in many of the examples herein. However, it is understood that the networks that the claimed embodiments may utilize are not so limited, although TCP/IP is a frequently implemented protocol. -
User systems 212 might communicate with system 216 using TCP/IP and, at a higher network level, use other common Internet protocols to communicate, such as HTTP, FTP, AFS, WAP, etc. In an example where HTTP is used, user system 212 might include an HTTP client commonly referred to as a “browser” for sending and receiving HTTP messages to and from an HTTP server at system 216. Such an HTTP server might be implemented as the sole network interface between system 216 and network 214, but other techniques might be used as well or instead. In some implementations, the interface between system 216 and network 214 includes load sharing functionality, such as round-robin HTTP request distributors to balance loads and distribute incoming HTTP requests evenly over a plurality of servers. At least as for the users that are accessing that server, each of the plurality of servers has access to the MTS's data; however, other alternative configurations may be used instead. - In one embodiment,
system 216, shown in FIG. 2, implements a web-based customer relationship management (CRM) system. For example, in one embodiment, system 216 includes application servers configured to implement and execute CRM software applications as well as provide related data, code, forms, webpages and other information to and from user systems 212 and to store to, and retrieve from, a database system related data, objects, and webpage content. With a multi-tenant system, data for multiple tenants may be stored in the same physical database object; however, tenant data typically is arranged so that data of one tenant is kept logically separate from that of other tenants so that one tenant does not have access to another tenant's data, unless such data is expressly shared. In certain embodiments, system 216 implements applications other than, or in addition to, a CRM application. For example, system 216 may provide tenant access to multiple hosted (standard and custom) applications, including a CRM application. User (or third party developer) applications, which may or may not include CRM, may be supported by the application platform 218, which manages creation of the applications, storage of the applications into one or more database objects, and execution of the applications in a virtual machine in the process space of the system 216. - One arrangement for elements of
system 216 is shown in FIG. 2, including a network interface 220, application platform 218, tenant data storage 222 for tenant data 223, system data storage 224 for system data 225 accessible to system 216 and possibly multiple tenants, program code 226 for implementing various functions of system 216, and a process space 228 for executing MTS system processes and tenant-specific processes, such as running applications as part of an application hosting service. Additional processes that may execute on system 216 include database indexing processes. - Several elements in the system shown in
FIG. 2 include conventional, well-known elements that are explained only briefly here. For example, each user system 212 may include a desktop personal computer, workstation, laptop, PDA, cell phone, or any Wireless Application Protocol (WAP) enabled device or any other computing device capable of interfacing directly or indirectly to the Internet or other network connection. User system 212 typically runs an HTTP client, e.g., a browsing program, such as Microsoft's Internet Explorer browser, a Mozilla or Firefox browser, an Opera browser, or a WAP-enabled browser in the case of a smartphone, tablet, PDA or other wireless device, or the like, allowing a user (e.g., subscriber of the multi-tenant database system) of user system 212 to access, process and view information, pages and applications available to it from system 216 over network 214. Each user system 212 also typically includes one or more user interface devices, such as a keyboard, a mouse, trackball, touch pad, touch screen, pen or the like, for interacting with a graphical user interface (GUI) provided by the browser on a display (e.g., a monitor screen, LCD display, etc.) in conjunction with pages, forms, applications and other information provided by system 216 or other systems or servers. For example, the user interface device can be used to access data and applications hosted by system 216, and to perform searches on stored data, and otherwise allow a user to interact with various GUI pages that may be presented to a user. As discussed above, embodiments are suitable for use with the Internet, which refers to a specific global internetwork of networks. However, it is understood that other networks can be used instead of the Internet, such as an intranet, an extranet, a virtual private network (VPN), a non-TCP/IP based network, any LAN or WAN or the like. - According to one embodiment, each
user system 212 and all of its components are operator configurable using applications, such as a browser, including computer code run using a central processing unit such as an Intel Pentium® processor or the like. Similarly, system 216 (and additional instances of an MTS, where more than one is present) and all of their components might be operator configurable using application(s) including computer code to run using a central processing unit such as processor system 217, which may include an Intel Pentium® processor or the like, and/or multiple processor units. - According to one embodiment, each
system 216 is configured to provide webpages, forms, applications, data and media content to user (client) systems 212 to support the access by user systems 212 as tenants of system 216. As such, system 216 provides security mechanisms to keep each tenant's data separate unless the data is shared. If more than one MTS is used, they may be located in close proximity to one another (e.g., in a server farm located in a single building or campus), or they may be distributed at locations remote from one another (e.g., one or more servers located in city A and one or more servers located in city B). As used herein, each MTS may include one or more logically and/or physically connected servers distributed locally or across one or more geographic locations. Additionally, the term “server” is meant to include a computer system, including processing hardware and process space(s), and an associated storage system and database application (e.g., OODBMS or RDBMS) as is well known in the art. It is understood that “server system” and “server” are often used interchangeably herein. Similarly, the database object described herein can be implemented as a single database, a distributed database, a collection of distributed databases, a database with redundant online or offline backups or other redundancies, etc., and might include a distributed database or storage network and associated processing intelligence. -
FIG. 3 illustrates a block diagram of an embodiment of elements of FIG. 2 and various possible interconnections between these elements. FIG. 3 also illustrates environment 210. However, in FIG. 3, the elements of system 216 and various interconnections in an embodiment are further illustrated. FIG. 3 shows that user system 212 may include a processor system 212A, memory system 212B, input system 212C, and output system 212D. FIG. 3 shows network 214 and system 216. FIG. 3 also shows that system 216 may include tenant data storage 222, tenant data 223, system data storage 224, system data 225, User Interface (UI) 330, Application Program Interface (API) 332 (e.g., a PreQL or JSON API), PL/SOQL 334, save routines 336, application setup mechanism 338, application servers 300_1 through 300_N, system process space 302, tenant process spaces 304, tenant management process space 310, tenant storage area 312, user storage 314, and application metadata 316. In other embodiments, environment 210 may not have the same elements as those listed above and/or may have other elements instead of, or in addition to, those listed above. -
User system 212, network 214, system 216, tenant data storage 222, and system data storage 224 were discussed above in FIG. 2. As shown by FIG. 3, system 216 may include a network interface 220 (of FIG. 2) implemented as a set of HTTP application servers 300, an application platform 218, tenant data storage 222, and system data storage 224. Also shown is system process space 302, including individual tenant process spaces 304 and a tenant management process space 310. Each application server 300 may be configured to access tenant data storage 222 and the tenant data 223 therein, and system data storage 224 and the system data 225 therein, to serve requests of user systems 212. The tenant data 223 might be divided into individual tenant storage areas 312, which can be either a physical arrangement and/or a logical arrangement of data. Within each tenant storage area 312, user storage 314 and application metadata 316 might be similarly allocated for each user. For example, a copy of a user's most recently used (MRU) items might be stored to user storage 314. Similarly, a copy of MRU items for an entire organization that is a tenant might be stored to tenant storage area 312. A UI 330 provides a user interface and an API 332 (e.g., a PreQL or JSON API) provides an application programmer interface to system 216 resident processes to users and/or developers at user systems 212. The tenant data and the system data may be stored in various databases, such as one or more Oracle™ databases. -
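As an illustration of the most recently used (MRU) items mentioned above, the following is a minimal sketch of an MRU list such as might be kept per user in user storage 314 or per organization in tenant storage area 312. The class name, the capacity limit, and the item identifiers are assumptions made for this example; the description does not specify how MRU items are tracked.

```python
from collections import OrderedDict


class MruList:
    """Tracks the most recently used item identifiers, newest first."""

    def __init__(self, capacity: int = 5):
        # The capacity is a hypothetical limit, not taken from the description.
        self.capacity = capacity
        self._items = OrderedDict()

    def touch(self, item_id: str) -> None:
        """Record a use of item_id, evicting the oldest entry if over capacity."""
        self._items.pop(item_id, None)   # drop any older position
        self._items[item_id] = True      # re-insert as the newest entry
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # remove the least recently used

    def items(self) -> list:
        """Return item identifiers ordered from most to least recently used."""
        return list(reversed(self._items))
```

One instance could be held for each user and a separate instance for the tenant organization as a whole, mirroring the two storage locations named in the paragraph.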
Application platform 218 includes an application setup mechanism 338 that supports application developers' creation and management of applications, which may be saved as metadata into tenant data storage 222 by save routines 336 for execution by subscribers as one or more tenant process spaces 304 managed by tenant management process space 310, for example. Invocations to such applications may be coded using PL/SOQL 334 that provides a programming language style interface extension to API 332 (e.g., a PreQL or JSON API). Invocations to applications may be detected by one or more system processes, which manage retrieving application metadata 316 for the subscriber making the invocation and executing the metadata as an application in a virtual machine. - Each
application server 300 may be communicably coupled to database systems, e.g., having access to system data 225 and tenant data 223, via a different network connection. For example, one application server 300_1 might be coupled via the network 214 (e.g., the Internet), another application server 300_(N-1) might be coupled via a direct network link, and another application server 300_N might be coupled by yet a different network connection. Transmission Control Protocol and Internet Protocol (TCP/IP) are typical protocols for communicating between application servers 300 and the database system. However, it will be apparent to one skilled in the art that other transport protocols may be used to optimize the system depending on the network interconnect used. - In certain embodiments, each
application server 300 is configured to handle requests for any user associated with any organization that is a tenant. Because it is desirable to be able to add and remove application servers from the server pool at any time for any reason, there is preferably no server affinity for a user and/or organization to a specific application server 300. In one embodiment, therefore, an interface system implementing a load balancing function (e.g., an F5 Big-IP load balancer) is communicably coupled between the application servers 300 and the user systems 212 to distribute requests to the application servers 300. In one embodiment, the load balancer uses a least connections algorithm to route user requests to the application servers 300. Other examples of load balancing algorithms, such as round robin and observed response time, also can be used. For example, in certain embodiments, three consecutive requests from the same user may hit three different application servers 300, and three requests from different users may hit the same application server 300. In this manner, system 216 is multi-tenant, in which system 216 handles storage of, and access to, different objects, data and applications across disparate users and organizations. - As an example of storage, one tenant might be a company that employs a sales force where each salesperson uses
system 216 to manage their sales process. Thus, a user might maintain contact data, leads data, customer follow-up data, performance data, goals and progress data, etc., all applicable to that user's personal sales process (e.g., in tenant data storage 222). In an example of an MTS arrangement, since all of the data and the applications to access, view, modify, report, transmit, calculate, etc., can be maintained and accessed by a user system having nothing more than network access, the user can manage his or her sales efforts and cycles from any of many different user systems. For example, if a salesperson is visiting a customer and the customer has Internet access in their lobby, the salesperson can obtain critical updates as to that customer while waiting for the customer to arrive in the lobby. - While each user's data might be separate from other users' data regardless of the employers of each user, some data might be organization-wide data shared or accessible by a plurality of users or all of the users for a given organization that is a tenant. Thus, there might be some data structures managed by
system 216 that are allocated at the tenant level while other data structures might be managed at the user level. Because an MTS might support multiple tenants including possible competitors, the MTS may have security protocols that keep data, applications, and application use separate. Also, because many tenants may opt for access to an MTS rather than maintain their own system, redundancy, up-time, and backup are additional functions that may be implemented in the MTS. In addition to user-specific data and tenant-specific data, system 216 might also maintain system level data usable by multiple tenants or other data. Such system level data might include industry reports, news, postings, and the like that are sharable among tenants. - In certain embodiments, user systems 212 (which may be client systems) communicate with
application servers 300 to request and update system-level and tenant-level data from system 216, which may require sending one or more queries to tenant data storage 222 and/or system data storage 224. System 216 (e.g., an application server 300 in system 216) automatically generates one or more SQL statements or PreQL statements (e.g., one or more SQL or PreQL queries respectively) that are designed to access the desired information. System data storage 224 may generate query plans to access the requested data from the database. - Each database can generally be viewed as a collection of objects, such as a set of logical tables, containing data fitted into predefined categories. A “table” is one representation of a data object, and may be used herein to simplify the conceptual description of objects and custom objects as described herein. It is understood that “table” and “object” may be used interchangeably herein. Each table generally contains one or more data categories logically arranged as columns or fields in a viewable schema. Each row or record of a table contains an instance of data for each category defined by the fields. For example, a CRM database may include a table that describes a customer with fields for basic contact information such as name, address, phone number, fax number, etc. Another table might describe a purchase order, including fields for information such as customer, product, sale price, date, etc. In some multi-tenant database systems, standard entity tables might be provided for use by all tenants. For CRM database applications, such standard entities might include tables for Account, Contact, Lead, and Opportunity data, each containing pre-defined fields. It is understood that the word “entity” may also be used interchangeably herein with “object” and “table.”
- In some multi-tenant database systems, tenants may be allowed to create and store custom objects, or they may be allowed to customize standard entities or objects, for example by creating custom fields for standard objects, including custom index fields. In certain embodiments, for example, all custom entity data rows are stored in a single multi-tenant physical table, which may contain multiple logical tables per organization. It is transparent to customers that their multiple “tables” are in fact stored in one large table or that their data may be stored in the same table as the data of other customers.
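The arrangement described above, in which custom entity rows for many organizations share one physical table while each organization sees only its own logical tables, can be sketched as follows. This is an illustrative sketch only: the table name, the column names (`org_id`, `entity_name`, `val0`, `val1`), and the sample rows are hypothetical and are not taken from the description, and an in-memory SQLite database stands in for the actual database.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create one physical table holding custom entity rows for all tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE custom_entity_data (
               org_id      TEXT NOT NULL,     -- tenant (organization) key
               entity_name TEXT NOT NULL,     -- logical table within the tenant
               row_id      INTEGER NOT NULL,
               val0        TEXT,              -- generic value columns shared
               val1        TEXT,              -- by every logical table
               PRIMARY KEY (org_id, entity_name, row_id))"""
    )
    rows = [
        ("org_a", "Invoice", 1, "INV-001", "120.00"),
        ("org_a", "Shipment", 1, "SHP-9", "in transit"),
        ("org_b", "Invoice", 1, "B-77", "15.50"),
    ]
    conn.executemany("INSERT INTO custom_entity_data VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def logical_table(conn: sqlite3.Connection, org_id: str, entity_name: str) -> list:
    """Return only the rows of one tenant's logical table."""
    cur = conn.execute(
        "SELECT row_id, val0, val1 FROM custom_entity_data "
        "WHERE org_id = ? AND entity_name = ? ORDER BY row_id",
        (org_id, entity_name),
    )
    return cur.fetchall()
```

Because every query is filtered by the tenant key, a tenant reading its "Invoice" table never sees another tenant's rows, even though both sets of rows live in the same physical table.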
-
FIG. 4 illustrates a diagrammatic representation of a machine 400 in the exemplary form of a computer system, in accordance with one embodiment, within which a set of instructions, for causing the machine/computer system 400 to perform any one or more of the methodologies discussed herein, may be executed. In alternative embodiments, the machine may be connected (e.g., networked) to other machines in a Local Area Network (LAN), an intranet, an extranet, or the public Internet. The machine may operate in the capacity of a server or a client machine in a client-server network environment, as a peer machine in a peer-to-peer (or distributed) network environment, or as a server or series of servers within an on-demand service environment. Certain embodiments of the machine may be in the form of a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, switch or bridge, computing system, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines (e.g., computers) that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. - The
exemplary computer system 400 includes a processor 402, a main memory 404 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc., static memory such as flash memory, static random access memory (SRAM), volatile but high-data rate RAM, etc.), and a secondary memory 418 (e.g., a persistent storage device including hard disk drives and a persistent database and/or a multi-tenant database implementation), which communicate with each other via a bus 430. Main memory 404 includes stored indices 424, an analysis engine 423, and a PreQL API 425. Main memory 404 and its sub-elements are operable in conjunction with processing logic 426 and processor 402 to perform the methodologies discussed herein. The computer system 400 may additionally or alternatively embody the server side elements as described above. -
Processor 402 represents one or more general-purpose processing devices such as a microprocessor, central processing unit, or the like. More particularly, the processor 402 may be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processor 402 may also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. Processor 402 is configured to execute the processing logic 426 for performing the operations and functionality discussed herein. - The
computer system 400 may further include a network interface card 408. The computer system 400 also may include a user interface 410 (such as a video display unit, a liquid crystal display (LCD), or a cathode ray tube (CRT)), an alphanumeric input device 412 (e.g., a keyboard), a cursor control device 414 (e.g., a mouse), and a signal generation device 416 (e.g., an integrated speaker). The computer system 400 may further include a peripheral device 436 (e.g., wireless or wired communication devices, memory devices, storage devices, audio processing devices, video processing devices, etc.). - The
secondary memory 418 may include a non-transitory machine-readable or computer readable storage medium 431 on which is stored one or more sets of instructions (e.g., software 422) embodying any one or more of the methodologies or functions described herein. The software 422 may also reside, completely or at least partially, within the main memory 404 and/or within the processor 402 during execution thereof by the computer system 400, the main memory 404 and the processor 402 also constituting machine-readable storage media. The software 422 may further be transmitted or received over a network 420 via the network interface card 408. -
FIG. 5A depicts a tablet computing device and a hand-held smartphone each having circuitry integrated therein as described in accordance with the embodiments. -
FIG. 5B is a block diagram of an embodiment of a tablet computing device, a smart phone, or other mobile device in which touchscreen interface connectors are used. - While the subject matter disclosed herein has been described by way of example and in terms of the specific embodiments, it is to be understood that the claimed embodiments are not limited to the explicitly enumerated embodiments disclosed. To the contrary, the disclosure is intended to cover various modifications and similar arrangements as are apparent to those skilled in the art. Therefore, the scope of the appended claims is to be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements. It is to be understood that the above description is intended to be illustrative, and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reading and understanding the above description. The scope of the disclosed subject matter is therefore to be determined in reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
Claims (7)
1. A method comprising:
receiving a dataset in tabular form of columns and rows;
triggering Veritable core analysis of the dataset received;
identifying hidden structure of the dataset via the Veritable core analysis, the hidden structure including one or more of relationships and causations in the data for which such relationships and causations are not pre-defined by the dataset;
storing the analyzed dataset including the hidden structure having the one or more of relationships and causations as a queryable model.
2. The method of claim 1, further comprising:
querying the stored queryable model for predictive analysis.
3. The method of claim 1, further comprising one or more of:
issuing a PreQL structure query against the queryable model, the PreQL structure comprising one of:
a PREDICT term;
a RELATED term;
a SIMILAR term; and
a GROUP term.
4. The method of claim 1, further comprising:
providing a graphical user interface to the Veritable core as a cloud based computing service.
5. The method of claim 1, further comprising:
providing a perceptible GUI as a cloud based service, the perceptible GUI accepting as input a data source within the predictive database;
presenting at the perceptible GUI, a table representing the data source within the predictive database, wherein the table has a plurality of non-null values and a plurality of null values;
providing a graphical slider mechanism at the perceptible GUI, wherein the graphical slider mechanism is manipulatable at a client device to increase and decrease a percentage of predictive fill for the null values of the table;
populating null values of the table at the graphical perceptible GUI responsive to the graphical slider mechanism registering an increase in value to populate by the client device.
6. The method of claim 5, wherein populating the null values comprises:
for every null value cell element in the data, retrieving a distribution via Veritable API calls from the Veritable core;
correlating a percentage fill value registered by the graphical slider mechanism to a necessary confidence threshold to reach the requested percentage fill; and
populating null values of the table at the graphical perceptible GUI until the percentage fill value is reached by selecting cell elements for which the corresponding distribution has a confidence in excess of the confidence threshold.
7. The method of claim 6, further comprising:
receiving a 100% fill value request from the graphical slider mechanism;
populating all null values of the table at the graphical perceptible GUI by degrading required confidence until a predicted result is available for every null value of the table.
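The slider-driven fill recited in claims 5 through 7 can be illustrated with a minimal sketch under stated assumptions. The claims do not define how confidence is derived from a distribution or how a fill percentage is mapped to a threshold; here, confidence is assumed to be the probability of the most likely value, and the threshold is assumed to be the confidence of the k-th most confident cell, where k is the number of cells needed to reach the requested percentage. All function names and the sample data shapes are hypothetical, and the distributions are passed in directly rather than retrieved through Veritable API calls.

```python
import math
from typing import Dict, Hashable, List, Tuple

Distribution = Dict[str, float]  # candidate value -> probability


def best_guess(dist: Distribution) -> Tuple[str, float]:
    """Most likely value and its probability (assumed confidence measure)."""
    value = max(dist, key=dist.get)
    return value, dist[value]


def threshold_for_fill(confidences: List[float], fill_percent: float) -> float:
    """Lowest confidence that must be accepted to reach the requested fill."""
    if not confidences or fill_percent <= 0:
        return math.inf  # nothing is filled
    ranked = sorted(confidences, reverse=True)
    k = min(len(ranked), math.ceil(len(ranked) * fill_percent / 100))
    return ranked[k - 1]


def predictive_fill(
    null_cells: Dict[Hashable, Distribution], fill_percent: float
) -> Dict[Hashable, str]:
    """Fill the null cells whose confidence meets the derived threshold."""
    guesses = {cell: best_guess(dist) for cell, dist in null_cells.items()}
    threshold = threshold_for_fill([c for _, c in guesses.values()], fill_percent)
    # Ties at the threshold are all filled, so slightly more than the
    # requested percentage may be populated.
    return {cell: v for cell, (v, c) in guesses.items() if c >= threshold}
```

In this sketch, a 100% request lowers the threshold to the least confident prediction, so every null cell receives a value, which corresponds to degrading the required confidence until a predicted result is available for every null value.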
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/014,204 US20140280065A1 (en) | 2013-03-13 | 2013-08-29 | Systems and methods for predictive query implementation and usage in a multi-tenant database system |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201361780503P | 2013-03-13 | 2013-03-13 | |
| US14/014,204 US20140280065A1 (en) | 2013-03-13 | 2013-08-29 | Systems and methods for predictive query implementation and usage in a multi-tenant database system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20140280065A1 true US20140280065A1 (en) | 2014-09-18 |
Family
ID=51532089
Family Applications (13)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/014,225 Active 2034-03-28 US9342836B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for implementing a predict command with a predictive query interface |
| US14/014,204 Abandoned US20140280065A1 (en) | 2013-03-13 | 2013-08-29 | Systems and methods for predictive query implementation and usage in a multi-tenant database system |
| US14/014,258 Active 2034-07-02 US9240016B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for implementing predictive query interface as a cloud service |
| US14/014,241 Active 2034-01-17 US9336533B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for implementing a similar command with a predictive query interface |
| US14/014,264 Active 2034-06-13 US9235846B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for populating a table having null values using a predictive query interface |
| US14/014,250 Active 2034-06-01 US9349132B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for implementing a group command with a predictive query interface |
| US14/014,221 Active 2034-04-01 US9367853B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for implementing data upload, processing, and predictive query API exposure |
| US14/014,236 Active 2034-05-14 US9454767B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for implementing a related command with a predictive query interface |
| US14/014,269 Active 2034-04-22 US9390428B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for rendering scored opportunities using a predictive query interface |
| US14/014,271 Active 2035-03-16 US10860557B2 (en) | 2013-03-13 | 2013-08-29 | Systems, methods, and apparatuses for implementing change value indication and historical value comparison |
| US14/992,925 Active US9753962B2 (en) | 2013-03-13 | 2016-01-11 | Systems, methods, and apparatuses for populating a table having null values using a predictive query interface |
| US15/181,256 Active US9690815B2 (en) | 2013-03-13 | 2016-06-13 | Systems, methods, and apparatuses for implementing data upload, processing, and predictive query API exposure |
| US15/249,026 Active US10963541B2 (en) | 2013-03-13 | 2016-08-26 | Systems, methods, and apparatuses for implementing a related command with a predictive query interface |
Country Status (6)
| Country | Link |
|---|---|
| US (13) | US9342836B2 (en) |
| EP (1) | EP2973004A1 (en) |
| JP (2) | JP6412550B2 (en) |
| CN (2) | CN110309119B (en) |
| CA (1) | CA2904526C (en) |
| WO (1) | WO2014143208A1 (en) |
Cited By (27)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150310021A1 (en) * | 2014-04-28 | 2015-10-29 | International Business Machines Corporation | Big data analytics brokerage |
| US20150327135A1 (en) * | 2014-04-24 | 2015-11-12 | Futurewei Technologies, Inc. | Apparatus and method for dynamic hybrid routing in sdn networks to avoid congestion and balance loads under changing traffic load |
| US20160110362A1 (en) * | 2014-10-20 | 2016-04-21 | International Business Machines Corporation | Automatic enumeration of data analysis options and rapid analysis of statistical models |
| US9418106B1 (en) * | 2015-10-19 | 2016-08-16 | International Business Machines Corporation | Joining operations in document oriented databases |
| US20160292216A1 (en) * | 2015-04-01 | 2016-10-06 | International Business Machines Corporation | Supporting multi-tenant applications on a shared database using pre-defined attributes |
| US20170351752A1 (en) * | 2016-06-07 | 2017-12-07 | Panoramix Solutions | Systems and methods for identifying and classifying text |
| US20180060324A1 (en) * | 2016-08-26 | 2018-03-01 | International Business Machines Corporation | Parallel scoring of an ensemble model |
| US20180260501A1 (en) * | 2017-03-10 | 2018-09-13 | General Electric Company | Systems and methods for overlaying and integrating computer aided design (cad) drawings with fluid models |
| US10179282B2 (en) | 2016-02-26 | 2019-01-15 | Impyrium, Inc. | Joystick input apparatus with living hinges |
| US20190042932A1 (en) * | 2017-08-01 | 2019-02-07 | Salesforce Com, Inc. | Techniques and Architectures for Deep Learning to Support Security Threat Detection |
| US10438126B2 (en) * | 2015-12-31 | 2019-10-08 | General Electric Company | Systems and methods for data estimation and forecasting |
| WO2020006567A1 (en) * | 2018-06-29 | 2020-01-02 | Security On-Demand, Inc. | Systems and methods for intelligent capture and fast transformations of granulated data summaries in database engines |
| US10650114B2 (en) | 2017-03-10 | 2020-05-12 | Ge Aviation Systems Llc | Systems and methods for utilizing a 3D CAD point-cloud to automatically create a fluid model |
| WO2020118432A1 (en) * | 2018-12-13 | 2020-06-18 | Element Ai Inc. | Data set access for updating machine learning models |
| US10803211B2 (en) | 2017-03-10 | 2020-10-13 | General Electric Company | Multiple fluid model tool for interdisciplinary fluid modeling |
| WO2021024205A1 (en) * | 2019-08-06 | 2021-02-11 | Bosman Philippus Johannes | Method and system of optimizing stock availability and sales opportunity |
| US10922362B2 (en) * | 2018-07-06 | 2021-02-16 | Clover Health | Models for utilizing siloed data |
| US10942911B2 (en) | 2015-03-20 | 2021-03-09 | D&B Business Information Solutions | Aggregating high volumes of temporal data from multiple overlapping sources |
| US10977397B2 (en) | 2017-03-10 | 2021-04-13 | Altair Engineering, Inc. | Optimization of prototype and machine design within a 3D fluid modeling environment |
| US11004568B2 (en) | 2017-03-10 | 2021-05-11 | Altair Engineering, Inc. | Systems and methods for multi-dimensional fluid modeling of an organism or organ |
| US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
| US20220121685A1 (en) * | 2020-10-16 | 2022-04-21 | Salesforce.Com, Inc. | Generating a query using training observations |
| US11429627B2 (en) | 2018-09-28 | 2022-08-30 | Splunk Inc. | System monitoring driven by automatically determined operational parameters of dependency graph model with user interface |
| US20220277327A1 (en) * | 2021-02-26 | 2022-09-01 | Capital One Services, Llc | Computer-based systems for data distribution allocation utilizing machine learning models and methods of use thereof |
| US11468505B1 (en) | 2018-06-12 | 2022-10-11 | Wells Fargo Bank, N.A. | Computer-based systems for calculating risk of asset transfers |
| US11620300B2 (en) * | 2018-09-28 | 2023-04-04 | Splunk Inc. | Real-time measurement and system monitoring based on generated dependency graph models of system components |
| US12013826B2 (en) | 2020-11-17 | 2024-06-18 | Coupang Corp. | Systems and methods for database query efficiency improvement |
Families Citing this family (351)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8285719B1 (en) * | 2008-08-08 | 2012-10-09 | The Research Foundation Of State University Of New York | System and method for probabilistic relational clustering |
| US8694593B1 (en) * | 2011-03-31 | 2014-04-08 | Google Inc. | Tools for micro-communities |
| US10891270B2 (en) | 2015-12-04 | 2021-01-12 | Mongodb, Inc. | Systems and methods for modelling virtual schemas in non-relational databases |
| US9589000B2 (en) * | 2012-08-30 | 2017-03-07 | Atheer, Inc. | Method and apparatus for content association and history tracking in virtual and augmented reality |
| US10185294B2 (en) * | 2012-09-25 | 2019-01-22 | Nec Corporation | Voltage control device and method for controlling the same |
| US10318901B2 (en) * | 2013-03-15 | 2019-06-11 | Connectwise, Llc | Systems and methods for business management using product data with product classes |
| US20140316953A1 (en) * | 2013-04-17 | 2014-10-23 | Vmware, Inc. | Determining datacenter costs |
| US20150032508A1 (en) * | 2013-07-29 | 2015-01-29 | International Business Machines Corporation | Systems And Methods For Probing Customers And Making Offers In An Interactive Setting |
| US9367806B1 (en) | 2013-08-08 | 2016-06-14 | Jasmin Cosic | Systems and methods of using an artificially intelligent database management system and interfaces for mobile, embedded, and other computing devices |
| US10223401B2 (en) | 2013-08-15 | 2019-03-05 | International Business Machines Corporation | Incrementally retrieving data for objects to provide a desired level of detail |
| US9442963B2 (en) * | 2013-08-27 | 2016-09-13 | Omnitracs, Llc | Flexible time-based aggregated derivations for advanced analytics |
| US20150073844A1 (en) * | 2013-09-06 | 2015-03-12 | International Business Machines Corporation | Generating multiply constrained globally optimized requests for proposal packages subject to uncertainty across multiple time horizons |
| US10437805B2 (en) * | 2013-09-24 | 2019-10-08 | Qliktech International Ab | Methods and systems for data management and analysis |
| US9727915B2 (en) * | 2013-09-26 | 2017-08-08 | Trading Technologies International, Inc. | Methods and apparatus to implement spin-gesture based trade action parameter selection |
| US20150142787A1 (en) * | 2013-11-19 | 2015-05-21 | Kurt L. Kimmerling | Method and system for search refinement |
| US9767196B1 (en) * | 2013-11-20 | 2017-09-19 | Google Inc. | Content selection |
| US20150170163A1 (en) * | 2013-12-17 | 2015-06-18 | Sap Ag | System and method for calculating and visualizing relevance of sales opportunities |
| US11386085B2 (en) | 2014-01-27 | 2022-07-12 | Microstrategy Incorporated | Deriving metrics from queries |
| US11921715B2 (en) | 2014-01-27 | 2024-03-05 | Microstrategy Incorporated | Search integration |
| US10255320B1 (en) * | 2014-01-27 | 2019-04-09 | Microstrategy Incorporated | Search integration |
| US11004146B1 (en) * | 2014-01-31 | 2021-05-11 | Intuit Inc. | Business health score and prediction of credit worthiness using credit worthiness of customers and vendors |
| US10096040B2 (en) | 2014-01-31 | 2018-10-09 | Walmart Apollo, Llc | Management of the display of online ad content consistent with one or more performance objectives for a webpage and/or website |
| US20170024659A1 (en) * | 2014-03-26 | 2017-01-26 | Bae Systems Information And Electronic Systems Integration Inc. | Method for data searching by learning and generalizing relational concepts from a few positive examples |
| KR20150138594A (en) * | 2014-05-30 | 2015-12-10 | 한국전자통신연구원 | Apparatus and Method for Providing multi-view UI generation for client devices of cloud game services |
| US10572488B2 (en) * | 2014-06-13 | 2020-02-25 | Koverse, Inc. | System and method for data organization, optimization and analytics |
| US10262048B1 (en) | 2014-07-07 | 2019-04-16 | Microstrategy Incorporated | Optimization of memory analytics |
| US10572935B1 (en) * | 2014-07-16 | 2020-02-25 | Intuit, Inc. | Disambiguation of entities based on financial interactions |
| US10380266B2 (en) * | 2014-08-11 | 2019-08-13 | InMobi Pte Ltd. | Method and system for analyzing data in a database |
| US10296662B2 (en) * | 2014-09-22 | 2019-05-21 | Ca, Inc. | Stratified sampling of log records for approximate full-text search |
| US10387494B2 (en) | 2014-09-24 | 2019-08-20 | Oracle International Corporation | Guided data exploration |
| US10762456B2 (en) * | 2014-09-30 | 2020-09-01 | International Business Machines Corporation | Migration estimation with partial data |
| US10037545B1 (en) * | 2014-12-08 | 2018-07-31 | Quantcast Corporation | Predicting advertisement impact for audience selection |
| US11636408B2 (en) * | 2015-01-22 | 2023-04-25 | Visier Solutions, Inc. | Techniques for manipulating and rearranging presentation of workforce data in accordance with different data-prediction scenarios available within a graphical user interface (GUI) of a computer system, and an apparatus and hardware memory implementing the techniques |
| US10402759B2 (en) * | 2015-01-22 | 2019-09-03 | Visier Solutions, Inc. | Systems and methods of adding and reconciling dimension members |
| US9881265B2 (en) * | 2015-01-30 | 2018-01-30 | Oracle International Corporation | Method and system for implementing historical trending for business records |
| US9971469B2 (en) | 2015-01-30 | 2018-05-15 | Oracle International Corporation | Method and system for presenting business intelligence information through infolets |
| US9971803B2 (en) | 2015-01-30 | 2018-05-15 | Oracle International Corporation | Method and system for embedding third party data into a SaaS business platform |
| US20160253680A1 (en) * | 2015-02-26 | 2016-09-01 | Ncr Corporation | Real-time inter and intra outlet trending |
| US9135559B1 (en) | 2015-03-20 | 2015-09-15 | TappingStone Inc. | Methods and systems for predictive engine evaluation, tuning, and replay of engine performance |
| US10713594B2 (en) | 2015-03-20 | 2020-07-14 | Salesforce.Com, Inc. | Systems, methods, and apparatuses for implementing machine learning model training and deployment with a rollback mechanism |
| US11443206B2 (en) | 2015-03-23 | 2022-09-13 | Tibco Software Inc. | Adaptive filtering and modeling via adaptive experimental designs to identify emerging data patterns from large volume, high dimensional, high velocity streaming data |
| US10671603B2 (en) * | 2016-03-11 | 2020-06-02 | Tibco Software Inc. | Auto query construction for in-database predictive analytics |
| US10614056B2 (en) * | 2015-03-24 | 2020-04-07 | NetSuite Inc. | System and method for automated detection of incorrect data |
| US10679228B2 (en) * | 2015-03-30 | 2020-06-09 | Walmart Apollo, Llc | Systems, devices, and methods for predicting product performance in a retail display area |
| WO2016168211A1 (en) * | 2015-04-13 | 2016-10-20 | Risk Management Solutions, Inc. | High performance big data computing system and platform |
| US10019542B2 (en) * | 2015-04-14 | 2018-07-10 | Ptc Inc. | Scoring a population of examples using a model |
| US10817544B2 (en) | 2015-04-20 | 2020-10-27 | Splunk Inc. | Scaling available storage based on counting generated events |
| US10282455B2 (en) | 2015-04-20 | 2019-05-07 | Splunk Inc. | Display of data ingestion information based on counting generated events |
| US10120375B2 (en) * | 2015-04-23 | 2018-11-06 | Johnson Controls Technology Company | Systems and methods for retraining outlier detection limits in a building management system |
| US10963795B2 (en) * | 2015-04-28 | 2021-03-30 | International Business Machines Corporation | Determining a risk score using a predictive model and medical model data |
| EP3258672B1 (en) * | 2015-04-30 | 2024-10-23 | Huawei Technologies Co., Ltd. | Cloud file transmission method, terminal and cloud server |
| WO2016179416A1 (en) * | 2015-05-05 | 2016-11-10 | Goyal Bharat | Predictive modeling and analytics integration platform |
| US20160335542A1 (en) * | 2015-05-12 | 2016-11-17 | Dell Software, Inc. | Method And Apparatus To Perform Native Distributed Analytics Using Metadata Encoded Decision Engine In Real Time |
| US11023462B2 (en) | 2015-05-14 | 2021-06-01 | Deephaven Data Labs, LLC | Single input graphical user interface control element and method |
| US10740292B2 (en) | 2015-05-18 | 2020-08-11 | Interactive Data Pricing And Reference Data Llc | Data conversion and distribution systems |
| AU2016271110B2 (en) * | 2015-05-29 | 2021-08-05 | Bytedance Inc. | Mobile search |
| CN104881749A (en) * | 2015-06-01 | 2015-09-02 | 北京圆通慧达管理软件开发有限公司 | Data management method and data storage system for multiple tenants |
| US10740129B2 (en) | 2015-06-05 | 2020-08-11 | International Business Machines Corporation | Distinguishing portions of output from multiple hosts |
| US9384203B1 (en) * | 2015-06-09 | 2016-07-05 | Palantir Technologies Inc. | Systems and methods for indexing and aggregating data records |
| WO2017003943A1 (en) * | 2015-06-29 | 2017-01-05 | Wal-Mart Stores, Inc. | Refrigerating home deliveries |
| US10102308B1 (en) | 2015-06-30 | 2018-10-16 | Groupon, Inc. | Method and apparatus for identifying related records |
| US10587671B2 (en) * | 2015-07-09 | 2020-03-10 | Zscaler, Inc. | Systems and methods for tracking and auditing changes in a multi-tenant cloud system |
| US10318864B2 (en) * | 2015-07-24 | 2019-06-11 | Microsoft Technology Licensing, Llc | Leveraging global data for enterprise data analytics |
| US9443192B1 (en) * | 2015-08-30 | 2016-09-13 | Jasmin Cosic | Universal artificial intelligence engine for autonomous computing devices and software applications |
| US10579687B2 (en) * | 2015-09-01 | 2020-03-03 | Google Llc | Providing native application search results with web search results |
| US10296833B2 (en) * | 2015-09-04 | 2019-05-21 | International Business Machines Corporation | System and method for estimating missing attributes of future events |
| US10803399B1 (en) * | 2015-09-10 | 2020-10-13 | EMC IP Holding Company LLC | Topic model based clustering of text data with machine learning utilizing interface feedback |
| US10216792B2 (en) * | 2015-10-14 | 2019-02-26 | Paxata, Inc. | Automated join detection |
| US10628749B2 (en) * | 2015-11-17 | 2020-04-21 | International Business Machines Corporation | Automatically assessing question answering system performance across possible confidence values |
| US10282678B2 (en) | 2015-11-18 | 2019-05-07 | International Business Machines Corporation | Automated similarity comparison of model answers versus question answering system output |
| US10445650B2 (en) | 2015-11-23 | 2019-10-15 | Microsoft Technology Licensing, Llc | Training and operating multi-layer computational models |
| US20170286532A1 (en) * | 2015-12-04 | 2017-10-05 | Eliot Horowitz | System and method for generating visual queries in non-relational databases |
| US11157465B2 (en) | 2015-12-04 | 2021-10-26 | Mongodb, Inc. | System and interfaces for performing document validation in a non-relational database |
| US11537667B2 (en) | 2015-12-04 | 2022-12-27 | Mongodb, Inc. | System and interfaces for performing document validation in a non-relational database |
| US9836444B2 (en) * | 2015-12-10 | 2017-12-05 | International Business Machines Corporation | Spread cell value visualization |
| US20170186018A1 (en) * | 2015-12-29 | 2017-06-29 | At&T Intellectual Property I, L.P. | Method and apparatus to create a customer care service |
| US10755221B2 (en) * | 2015-12-29 | 2020-08-25 | Workfusion, Inc. | Worker answer confidence estimation for worker assessment |
| US10762539B2 (en) * | 2016-01-27 | 2020-09-01 | Amobee, Inc. | Resource estimation for queries in large-scale distributed database system |
| US10285001B2 (en) | 2016-02-26 | 2019-05-07 | Snap Inc. | Generation, curation, and presentation of media collections |
| US11023514B2 (en) * | 2016-02-26 | 2021-06-01 | Snap Inc. | Methods and systems for generation, curation, and presentation of media collections |
| US10929815B2 (en) * | 2016-03-14 | 2021-02-23 | Buildgroup Data Services Inc. | Adaptive and reusable processing of retroactive sequences for automated predictions |
| US10055263B2 (en) * | 2016-04-01 | 2018-08-21 | Ebay Inc. | Optimization of parallel processing using waterfall representations |
| US10929370B2 (en) * | 2016-04-14 | 2021-02-23 | International Business Machines Corporation | Index maintenance management of a relational database management system |
| US10585874B2 (en) * | 2016-04-25 | 2020-03-10 | International Business Machines Corporation | Locking concurrent commands in a database management system |
| USD786277S1 (en) * | 2016-04-29 | 2017-05-09 | Salesforce.Com, Inc. | Display screen or portion thereof with animated graphical user interface |
| US9785715B1 (en) | 2016-04-29 | 2017-10-10 | Conversable, Inc. | Systems, media, and methods for automated response to queries made by interactive electronic chat |
| USD786276S1 (en) * | 2016-04-29 | 2017-05-09 | Salesforce.Com, Inc. | Display screen or portion thereof with animated graphical user interface |
| USD786896S1 (en) * | 2016-04-29 | 2017-05-16 | Salesforce.Com, Inc. | Display screen or portion thereof with animated graphical user interface |
| US11194864B2 (en) * | 2016-05-10 | 2021-12-07 | Aircloak Gmbh | Systems and methods for anonymized statistical database queries |
| US11194823B2 (en) | 2016-05-10 | 2021-12-07 | Aircloak Gmbh | Systems and methods for anonymized statistical database queries using noise elements |
| US9965650B1 (en) * | 2016-05-11 | 2018-05-08 | MDClone Ltd. | Computer system of computer servers and dedicated computer clients specially programmed to generate synthetic non-reversible electronic data records based on real-time electronic querying and methods of use thereof |
| US10607146B2 (en) * | 2016-06-02 | 2020-03-31 | International Business Machines Corporation | Predicting user question in question and answer system |
| US10515085B2 (en) | 2016-06-19 | 2019-12-24 | Data.World, Inc. | Consolidator platform to implement collaborative datasets via distributed computer networks |
| EP3472718A4 (en) * | 2016-06-19 | 2020-04-01 | Data.world, Inc. | Consolidation of collaborative data sets via distributed computer networks |
| US10740328B2 (en) * | 2016-06-24 | 2020-08-11 | Microsoft Technology Licensing, Llc | Aggregate-query database system and processing |
| CN106156423B (en) * | 2016-07-01 | 2019-07-12 | 合肥海本蓝科技有限公司 | A kind of method and apparatus realizing test platform and being communicated with user's trial-ray method to be measured |
| US10623406B2 (en) * | 2016-07-22 | 2020-04-14 | Box, Inc. | Access authentication for cloud-based shared content |
| US10216782B2 (en) * | 2016-08-12 | 2019-02-26 | Sap Se | Processing of updates in a database system using different scenarios |
| US10521572B2 (en) | 2016-08-16 | 2019-12-31 | Lexisnexis Risk Solutions Inc. | Systems and methods for improving KBA identity authentication questions |
| US9864933B1 (en) | 2016-08-23 | 2018-01-09 | Jasmin Cosic | Artificially intelligent systems, devices, and methods for learning and/or using visual surrounding for autonomous object operation |
| WO2018038719A1 (en) * | 2016-08-24 | 2018-03-01 | Halliburton Energy Services, Inc. | Platform services with customer data access |
| GB201615745D0 (en) * | 2016-09-15 | 2016-11-02 | Gb Gas Holdings Ltd | System for analysing data relationships to support query execution |
| US11625662B2 (en) * | 2016-09-22 | 2023-04-11 | Qvinci Software, Llc | Methods and apparatus for the manipulating and providing of anonymized data collected from a plurality of sources |
| US10296659B2 (en) * | 2016-09-26 | 2019-05-21 | International Business Machines Corporation | Search query intent |
| US20180089585A1 (en) * | 2016-09-29 | 2018-03-29 | Salesforce.Com, Inc. | Machine learning model for predicting state of an object representing a potential transaction |
| US11386336B2 (en) * | 2016-10-06 | 2022-07-12 | The Dun And Bradstreet Corporation | Machine learning classifier and prediction engine for artificial intelligence optimized prospect determination on win/loss classification |
| US10510088B2 (en) | 2016-10-07 | 2019-12-17 | Bank Of America Corporation | Leveraging an artificial intelligence engine to generate customer-specific user experiences based on real-time analysis of customer responses to recommendations |
| US10614517B2 (en) | 2016-10-07 | 2020-04-07 | Bank Of America Corporation | System for generating user experience for improving efficiencies in computing network functionality by specializing and minimizing icon and alert usage |
| US10621558B2 (en) | 2016-10-07 | 2020-04-14 | Bank Of America Corporation | System for automatically establishing an operative communication channel to transmit instructions for canceling duplicate interactions with third party systems |
| US20180101900A1 (en) * | 2016-10-07 | 2018-04-12 | Bank Of America Corporation | Real-time dynamic graphical representation of resource utilization and management |
| US10476974B2 (en) | 2016-10-07 | 2019-11-12 | Bank Of America Corporation | System for automatically establishing operative communication channel with third party computing systems for subscription regulation |
| WO2018080857A1 (en) * | 2016-10-28 | 2018-05-03 | Panoptex Technologies, Inc. | Systems and methods for creating, storing, and analyzing secure data |
| US10452974B1 (en) | 2016-11-02 | 2019-10-22 | Jasmin Cosic | Artificially intelligent systems, devices, and methods for learning and/or using a device's circumstances for autonomous device operation |
| US10474339B2 (en) | 2016-11-04 | 2019-11-12 | Sismo Sas | System and method for market visualization |
| US11188551B2 (en) * | 2016-11-04 | 2021-11-30 | Microsoft Technology Licensing, Llc | Multi-level data pagination |
| US10482248B2 (en) * | 2016-11-09 | 2019-11-19 | Cylance Inc. | Shellcode detection |
| US10536536B1 (en) | 2016-11-15 | 2020-01-14 | State Farm Mutual Automobile Insurance Company | Resource discovery agent computing device, software application, and method |
| US20180150879A1 (en) * | 2016-11-25 | 2018-05-31 | Criteo Sa | Automatic selection of items for a computerized graphical advertisement display using a computer-generated multidimensional vector space |
| CN106708946A (en) * | 2016-11-25 | 2017-05-24 | 国云科技股份有限公司 | A Table Query Method of General API |
| US10607134B1 (en) | 2016-12-19 | 2020-03-31 | Jasmin Cosic | Artificially intelligent systems, devices, and methods for learning and/or using an avatar's circumstances for autonomous avatar operation |
| US11710089B2 (en) * | 2016-12-22 | 2023-07-25 | Atlassian Pty Ltd. | Method and apparatus for a benchmarking service |
| US10304522B2 (en) | 2017-01-31 | 2019-05-28 | International Business Machines Corporation | Method for low power operation and test using DRAM device |
| CN106874467B (en) * | 2017-02-15 | 2019-12-06 | 百度在线网络技术(北京)有限公司 | Method and apparatus for providing search results |
| US11481644B2 (en) * | 2017-02-17 | 2022-10-25 | Nike, Inc. | Event prediction |
| US9916890B1 (en) * | 2017-02-21 | 2018-03-13 | International Business Machines Corporation | Predicting data correlation using multivalued logical outputs in static random access memory (SRAM) storage cells |
| US10831509B2 (en) | 2017-02-23 | 2020-11-10 | Ab Initio Technology Llc | Dynamic execution of parameterized applications for the processing of keyed network data streams |
| US11947978B2 (en) | 2017-02-23 | 2024-04-02 | Ab Initio Technology Llc | Dynamic execution of parameterized applications for the processing of keyed network data streams |
| US10693867B2 (en) | 2017-03-01 | 2020-06-23 | Futurewei Technologies, Inc. | Apparatus and method for predictive token validation |
| US20180253677A1 (en) * | 2017-03-01 | 2018-09-06 | Gregory James Foster | Method for Performing Dynamic Data Analytics |
| US11068453B2 (en) | 2017-03-09 | 2021-07-20 | data.world, Inc | Determining a degree of similarity of a subset of tabular data arrangements to subsets of graph data arrangements at ingestion into a data-driven collaborative dataset platform |
| US10586359B1 (en) * | 2017-03-09 | 2020-03-10 | Workday, Inc. | Methods and systems for creating waterfall charts |
| US10515233B2 (en) * | 2017-03-19 | 2019-12-24 | International Business Machines Corporation | Automatic generating analytics from blockchain data |
| US11537590B2 (en) * | 2017-03-28 | 2022-12-27 | Walmart Apollo, Llc | Systems and methods for computer assisted database change documentation |
| USD828377S1 (en) * | 2017-04-12 | 2018-09-11 | Intuit Inc. | Display screen with graphical user interface |
| US11030674B2 (en) | 2017-04-14 | 2021-06-08 | International Business Machines Corporation | Cognitive order processing by predicting resalable returns |
| US10242037B2 (en) * | 2017-04-20 | 2019-03-26 | Servicenow, Inc. | Index suggestion engine for relational databases |
| US20180308002A1 (en) * | 2017-04-20 | 2018-10-25 | Bank Of America Corporation | Data processing system with machine learning engine to provide system control functions |
| US20180308008A1 (en) | 2017-04-25 | 2018-10-25 | Xaxis, Inc. | Double Blind Machine Learning Insight Interface Apparatuses, Methods and Systems |
| US10795901B2 (en) * | 2017-05-09 | 2020-10-06 | Jpmorgan Chase Bank, N.A. | Generic entry and exit network interface system and method |
| US11005864B2 (en) | 2017-05-19 | 2021-05-11 | Salesforce.Com, Inc. | Feature-agnostic behavior profile based anomaly detection |
| US11270023B2 (en) | 2017-05-22 | 2022-03-08 | International Business Machines Corporation | Anonymity assessment system |
| CN107392220B (en) | 2017-05-31 | 2020-05-05 | 创新先进技术有限公司 | Data flow clustering method and device |
| US10663502B2 (en) | 2017-06-02 | 2020-05-26 | International Business Machines Corporation | Real time cognitive monitoring of correlations between variables |
| US10037792B1 (en) | 2017-06-02 | 2018-07-31 | International Business Machines Corporation | Optimizing data approximation analysis using low power circuitry |
| US11526768B2 (en) | 2017-06-02 | 2022-12-13 | International Business Machines Corporation | Real time cognitive reasoning using a circuit with varying confidence level alerts |
| US10598710B2 (en) | 2017-06-02 | 2020-03-24 | International Business Machines Corporation | Cognitive analysis using applied analog circuits |
| US11042891B2 (en) * | 2017-06-05 | 2021-06-22 | International Business Machines Corporation | Optimizing revenue savings for actionable predictions of revenue change |
| US11055730B2 (en) * | 2017-06-05 | 2021-07-06 | International Business Machines Corporation | Optimizing predictive precision for actionable forecasts of revenue change |
| US11062334B2 (en) * | 2017-06-05 | 2021-07-13 | International Business Machines Corporation | Predicting ledger revenue change behavior of clients receiving services |
| CN110869962A (en) | 2017-07-06 | 2020-03-06 | 飒乐有限公司 | Data collation based on computer analysis of data |
| US11481662B1 (en) * | 2017-07-31 | 2022-10-25 | Amazon Technologies, Inc. | Analysis of interactions with data objects stored by a network-based storage service |
| CA3073199A1 (en) * | 2017-08-18 | 2019-02-21 | ISMS Solutions, LLC | Computer based learning system for analyzing agreements |
| US10002154B1 (en) | 2017-08-24 | 2018-06-19 | Illumon Llc | Computer data system data source having an update propagation graph with feedback cyclicality |
| US10616357B2 (en) * | 2017-08-24 | 2020-04-07 | Bank Of America Corporation | Event tracking and notification based on sensed data |
| US11282021B2 (en) * | 2017-09-22 | 2022-03-22 | Jpmorgan Chase Bank, N.A. | System and method for implementing a federated forecasting framework |
| USD847841S1 (en) * | 2017-11-01 | 2019-05-07 | Apple Inc. | Display screen or portion thereof with graphical user interface |
| US10824608B2 (en) * | 2017-11-10 | 2020-11-03 | Salesforce.Com, Inc. | Feature generation and storage in a multi-tenant environment |
| CN109814936A (en) * | 2017-11-20 | 2019-05-28 | 广东欧珀移动通信有限公司 | Application program prediction model establishing and preloading method, device, medium and terminal |
| CN109814937A (en) * | 2017-11-20 | 2019-05-28 | 广东欧珀移动通信有限公司 | Application program prediction model establishing and preloading method, apparatus, medium and terminal |
| US20190156232A1 (en) * | 2017-11-21 | 2019-05-23 | Red Hat, Inc. | Job scheduler implementation based on user behavior |
| US10474934B1 (en) | 2017-11-26 | 2019-11-12 | Jasmin Cosic | Machine learning for computing enabled systems and/or devices |
| USD844653S1 (en) * | 2017-11-26 | 2019-04-02 | Jan Magnus Edman | Display screen with graphical user interface |
| US11537931B2 (en) * | 2017-11-29 | 2022-12-27 | Google Llc | On-device machine learning platform to enable sharing of machine-learned models between applications |
| TWI649712B (en) * | 2017-12-08 | 2019-02-01 | 財團法人工業技術研究院 | Electronic device, presentation process module presentation method and computer readable medium |
| US10803108B2 (en) | 2017-12-20 | 2020-10-13 | International Business Machines Corporation | Facilitation of domain and client-specific application program interface recommendations |
| US10831772B2 (en) | 2017-12-20 | 2020-11-10 | International Business Machines Corporation | Facilitation of domain and client-specific application program interface recommendations |
| US11120103B2 (en) * | 2017-12-23 | 2021-09-14 | Salesforce.Com, Inc. | Predicting binary outcomes of an activity |
| CN109976823A (en) * | 2017-12-27 | 2019-07-05 | Tcl集团股份有限公司 | Application program launching method, device and terminal device |
| CN108196838B (en) * | 2017-12-30 | 2021-01-15 | 京信通信系统(中国)有限公司 | Memory data management method and device, storage medium and computer equipment |
| US20190220766A1 (en) * | 2018-01-12 | 2019-07-18 | Gamalon, Inc. | Probabilistic Modeling System and Method |
| US11348126B2 (en) * | 2018-01-15 | 2022-05-31 | The Nielsen Company (Us), Llc | Methods and apparatus for campaign mapping for total audience measurement |
| CN108228879A (en) * | 2018-01-23 | 2018-06-29 | 平安普惠企业管理有限公司 | Data updating method, storage medium and smart device |
| US20190235984A1 (en) * | 2018-01-30 | 2019-08-01 | Salesforce.Com, Inc. | Systems and methods for providing predictive performance forecasting for component-driven, multi-tenant applications |
| US10902194B2 (en) | 2018-02-09 | 2021-01-26 | Microsoft Technology Licensing, Llc | Natively handling approximate values in spreadsheet applications |
| US10445422B2 (en) * | 2018-02-09 | 2019-10-15 | Microsoft Technology Licensing, Llc | Identification of sets and manipulation of set data in productivity applications |
| US11023551B2 (en) * | 2018-02-23 | 2021-06-01 | Accenture Global Solutions Limited | Document processing based on proxy logs |
| US11216706B2 (en) * | 2018-03-15 | 2022-01-04 | Datorama Technologies Ltd. | System and method for visually presenting interesting plots of tabular data |
| US20190286840A1 (en) * | 2018-03-15 | 2019-09-19 | Honeywell International Inc. | Controlling access to customer data by external third parties |
| USD873680S1 (en) * | 2018-03-15 | 2020-01-28 | Apple Inc. | Electronic device with graphical user interface |
| USD861014S1 (en) | 2018-03-15 | 2019-09-24 | Apple Inc. | Electronic device with graphical user interface |
| US11023495B2 (en) * | 2018-03-19 | 2021-06-01 | Adobe Inc. | Automatically generating meaningful user segments |
| WO2019180801A1 (en) * | 2018-03-20 | 2019-09-26 | 三菱電機株式会社 | Display device, display system, display screen generation method |
| CN108469783B (en) * | 2018-05-14 | 2021-02-02 | 西北工业大学 | Deep hole roundness error prediction method based on Bayesian network |
| US10915587B2 (en) | 2018-05-18 | 2021-02-09 | Google Llc | Data processing system for generating entries in data structures from network requests |
| US12117997B2 (en) * | 2018-05-22 | 2024-10-15 | Data.World, Inc. | Auxiliary query commands to deploy predictive data models for queries in a networked computing platform |
| JP6805206B2 (en) * | 2018-05-22 | 2020-12-23 | 日本電信電話株式会社 | Search word suggestion device, expression information creation method, and expression information creation program |
| US10540669B2 (en) * | 2018-05-30 | 2020-01-21 | Sas Institute Inc. | Managing object values and resource consumption |
| US12450541B2 (en) * | 2018-06-04 | 2025-10-21 | Zuora, Inc. | Systems and methods for providing tiered subscription data storage in a multi-tenant system |
| US11281686B2 (en) * | 2018-06-04 | 2022-03-22 | Nec Corporation | Information processing apparatus, method, and program |
| CN108984603A (en) * | 2018-06-05 | 2018-12-11 | 试金石信用服务有限公司 | Heterogeneous data acquisition method, equipment, storage medium and system |
| US10586362B2 (en) * | 2018-06-18 | 2020-03-10 | Microsoft Technology Licensing, Llc | Interactive layout-aware construction of bespoke charts |
| US11816676B2 (en) * | 2018-07-06 | 2023-11-14 | Nice Ltd. | System and method for generating journey excellence score |
| US11210618B2 (en) * | 2018-07-10 | 2021-12-28 | Walmart Apollo, Llc | Systems and methods for generating a two-dimensional planogram based on intermediate data structures |
| CA3105675A1 (en) * | 2018-07-10 | 2020-01-16 | Lymbyc Solutions Private Limited | Machine intelligence for research and analytics (mira) system and method |
| CN110888889B (en) * | 2018-08-17 | 2023-08-15 | 阿里巴巴集团控股有限公司 | Data information updating method, device and equipment |
| US10552541B1 (en) | 2018-08-27 | 2020-02-04 | International Business Machines Corporation | Processing natural language queries based on machine learning |
| USD891454S1 (en) * | 2018-09-11 | 2020-07-28 | Apple Inc. | Electronic device with animated graphical user interface |
| EP3627399B1 (en) * | 2018-09-19 | 2024-08-14 | Tata Consultancy Services Limited | Systems and methods for real time configurable recommendation using user data |
| US11151197B2 (en) * | 2018-09-19 | 2021-10-19 | Palantir Technologies Inc. | Enhanced processing of time series data via parallelization of instructions |
| US10915827B2 (en) * | 2018-09-24 | 2021-02-09 | Salesforce.Com, Inc. | System and method for field value recommendations based on confidence levels in analyzed dataset |
| US11069447B2 (en) * | 2018-09-29 | 2021-07-20 | Intego Group, LLC | Systems and methods for topology-based clinical data mining |
| CN109325092A (en) * | 2018-11-27 | 2019-02-12 | 中山大学 | A Nonparametric Parallelized Hierarchical Dirichlet Process Topic Model System Fusing Phrase Information |
| US20200167414A1 (en) * | 2018-11-28 | 2020-05-28 | Citrix Systems, Inc. | Webform generation and population |
| CN111259201B (en) * | 2018-12-03 | 2023-08-18 | 北京嘀嘀无限科技发展有限公司 | Data maintenance method and system |
| CN109635004B (en) * | 2018-12-13 | 2023-05-05 | 广东工业大学 | A database object description providing method, device and equipment |
| US10936974B2 (en) | 2018-12-24 | 2021-03-02 | Icertis, Inc. | Automated training and selection of models for document analysis |
| KR102102276B1 (en) * | 2018-12-28 | 2020-04-22 | 동국대학교 산학협력단 | Method of measuring similarity between tables based on deep learning technique |
| US11488062B1 (en) * | 2018-12-30 | 2022-11-01 | Perimetrics, Inc. | Determination of structural characteristics of an object |
| US11282093B2 (en) * | 2018-12-31 | 2022-03-22 | Tata Consultancy Services Limited | Method and system for machine learning based item matching by considering user mindset |
| US11068448B2 (en) * | 2019-01-07 | 2021-07-20 | Salesforce.Com, Inc. | Archiving objects in a database environment |
| US10937073B2 (en) * | 2019-01-23 | 2021-03-02 | Intuit Inc. | Predicting delay in a process |
| US11017339B2 (en) * | 2019-02-05 | 2021-05-25 | International Business Machines Corporation | Cognitive labor forecasting |
| US10726374B1 (en) | 2019-02-19 | 2020-07-28 | Icertis, Inc. | Risk prediction based on automated analysis of documents |
| US11321392B2 (en) * | 2019-02-19 | 2022-05-03 | International Business Machines Corporation | Light weight index for querying low-frequency data in a big data environment |
| US11792226B2 (en) | 2019-02-25 | 2023-10-17 | Oracle International Corporation | Automatic api document generation from scim metadata |
| US11170064B2 (en) | 2019-03-05 | 2021-11-09 | Corinne David | Method and system to filter out unwanted content from incoming social media data |
| EP3938929A4 (en) * | 2019-03-15 | 2023-01-11 | 3M Innovative Properties Company | Determination of causal models for controlling environments |
| CN110083597A (en) * | 2019-03-16 | 2019-08-02 | 平安普惠企业管理有限公司 | Order querying method, device, computer equipment and storage medium |
| US11709910B1 (en) * | 2019-03-18 | 2023-07-25 | Cigna Intellectual Property, Inc. | Systems and methods for imputing missing values in data sets |
| USD912074S1 (en) * | 2019-03-25 | 2021-03-02 | Warsaw Orthopedic, Inc. | Display screen with graphical user interface for medical treatment and/or diagnostics |
| US11568430B2 (en) | 2019-04-08 | 2023-01-31 | Ebay Inc. | Third-party testing platform |
| US11188671B2 (en) * | 2019-04-11 | 2021-11-30 | Bank Of America Corporation | Distributed data chamber system |
| US11810015B2 (en) | 2019-04-22 | 2023-11-07 | Walmart Apollo, Llc | Forecasting system |
| US20200342302A1 (en) * | 2019-04-24 | 2020-10-29 | Accenture Global Solutions Limited | Cognitive forecasting |
| GB201905966D0 (en) * | 2019-04-29 | 2019-06-12 | Palantir Technologies Inc | Security system and method |
| CN110209449B (en) * | 2019-05-21 | 2022-02-15 | 腾讯科技(深圳)有限公司 | Method and device for positioning cursor in game |
| US12260169B2 (en) * | 2019-05-23 | 2025-03-25 | Sigma Computing, Inc. | Using lightweight references to present a worksheet |
| US11507869B2 (en) * | 2019-05-24 | 2022-11-22 | Digital Lion, LLC | Predictive modeling and analytics for processing and distributing data traffic |
| US11934971B2 (en) | 2019-05-24 | 2024-03-19 | Digital Lion, LLC | Systems and methods for automatically building a machine learning model |
| US11715144B2 (en) | 2019-05-24 | 2023-08-01 | Salesforce, Inc. | Dynamic ranking of recommendation pairings |
| US11373232B2 (en) * | 2019-05-24 | 2022-06-28 | Salesforce.Com, Inc. | Dynamic ranking of recommendation pairings |
| USD913325S1 (en) | 2019-05-31 | 2021-03-16 | Apple Inc. | Electronic device with graphical user interface |
| USD914056S1 (en) | 2019-05-31 | 2021-03-23 | Apple Inc. | Electronic device with animated graphical user interface |
| CN112068986B (en) * | 2019-06-10 | 2024-04-12 | 伊姆西Ip控股有限责任公司 | Method, apparatus and computer program product for managing backup tasks |
| CN110377632B (en) * | 2019-06-17 | 2023-06-20 | 平安科技(深圳)有限公司 | Litigation result prediction method, litigation result prediction device, litigation result prediction computer device and litigation result prediction storage medium |
| US11093378B2 (en) | 2019-06-27 | 2021-08-17 | Capital One Services, Llc | Testing agent for application dependency discovery, reporting, and management tool |
| US10521235B1 (en) | 2019-06-27 | 2019-12-31 | Capital One Services, Llc | Determining problem dependencies in application dependency discovery, reporting, and management tool |
| US10642719B1 (en) | 2019-06-27 | 2020-05-05 | Capital One Services, Llc | Intelligent services for application dependency discovery, reporting, and management tool |
| US10747544B1 (en) | 2019-06-27 | 2020-08-18 | Capital One Services, Llc | Dependency analyzer in application dependency discovery, reporting, and management tool |
| US11354222B2 (en) | 2019-06-27 | 2022-06-07 | Capital One Services, Llc | Discovery crawler for application dependency discovery, reporting, and management tool |
| US11379292B2 (en) | 2019-06-27 | 2022-07-05 | Capital One Services, Llc | Baseline modeling for application dependency discovery, reporting, and management tool |
| US10915428B2 (en) | 2019-06-27 | 2021-02-09 | Capital One Services, Llc | Intelligent services and training agent for application dependency discovery, reporting, and management tool |
| CN112256740A (en) * | 2019-07-22 | 2021-01-22 | 王其宏 | System and method for integrating qualitative data and quantitative data to recommend auditing criteria |
| US11550830B2 (en) * | 2019-07-29 | 2023-01-10 | Pytho, Llc | Systems and methods for multi-source reference class identification, base rate calculation, and prediction |
| CN112307056B (en) * | 2019-07-31 | 2024-02-06 | 华控清交信息科技(北京)有限公司 | Data processing method and device for data processing |
| US11553823B2 (en) * | 2019-08-02 | 2023-01-17 | International Business Machines Corporation | Leveraging spatial scanning data of autonomous robotic devices |
| US20210049159A1 (en) * | 2019-08-15 | 2021-02-18 | International Business Machines Corporation | Visualization and Validation of Cardinality-Constrained Groups of Data Entry Fields |
| US20210064670A1 (en) * | 2019-08-28 | 2021-03-04 | Microsoft Technology Licensing, Llc | Customizing and updating analytics of remote data source |
| US11150886B2 (en) * | 2019-09-03 | 2021-10-19 | Microsoft Technology Licensing, Llc | Automatic probabilistic upgrade of tenant devices |
| JP7309533B2 (en) * | 2019-09-06 | 2023-07-18 | 株式会社日立製作所 | Model improvement support system |
| USD949179S1 (en) | 2019-09-06 | 2022-04-19 | Apple Inc. | Display screen or portion thereof with animated graphical user interface |
| US11687378B2 (en) * | 2019-09-13 | 2023-06-27 | Oracle International Corporation | Multi-tenant identity cloud service with on-premise authentication integration and bridge high availability |
| US11870770B2 (en) | 2019-09-13 | 2024-01-09 | Oracle International Corporation | Multi-tenant identity cloud service with on-premise authentication integration |
| US11151041B2 (en) * | 2019-10-15 | 2021-10-19 | Micron Technology, Inc. | Tokens to indicate completion of data storage |
| KR102314068B1 (en) * | 2019-10-18 | 2021-10-18 | 중앙대학교 산학협력단 | Animal hospital integration data base building system and method |
| US11347736B2 (en) * | 2019-10-30 | 2022-05-31 | Boray Data Technology Co. Ltd. | Dynamic query optimization |
| US11182841B2 (en) | 2019-11-08 | 2021-11-23 | Accenture Global Solutions Limited | Prospect recommendation |
| CN111126018B (en) * | 2019-11-25 | 2023-08-08 | 泰康保险集团股份有限公司 | Form generation method and device, storage medium and electronic equipment |
| US11625736B2 (en) * | 2019-12-02 | 2023-04-11 | Oracle International Corporation | Using machine learning to train and generate an insight engine for determining a predicted sales insight |
| US11526665B1 (en) * | 2019-12-11 | 2022-12-13 | Amazon Technologies, Inc. | Determination of root causes of customer returns |
| WO2021119379A1 (en) | 2019-12-12 | 2021-06-17 | Applied Underwriters, Inc. | Interactive stochastic design tool |
| CN111125199B (en) * | 2019-12-30 | 2023-06-13 | 中国农业银行股份有限公司 | Database access method and device and electronic equipment |
| US11663617B2 (en) * | 2020-01-03 | 2023-05-30 | Sap Se | Dynamic file generation system |
| USD963742S1 (en) | 2020-01-09 | 2022-09-13 | Apple Inc. | Type font |
| USD963741S1 (en) | 2020-01-09 | 2022-09-13 | Apple Inc. | Type font |
| US11810089B2 (en) * | 2020-01-14 | 2023-11-07 | Snowflake Inc. | Data exchange-based platform |
| CN112632105B (en) * | 2020-01-17 | 2021-09-10 | 华东师范大学 | System and method for verifying correctness of large-scale transaction load generation and database isolation level |
| US11829913B2 (en) * | 2020-01-18 | 2023-11-28 | SkyKick, Inc. | Facilitating activity logs within a multi-service system |
| FI20205171A1 (en) * | 2020-02-20 | 2021-08-21 | Q Factory Oy | Intelligent database system and method |
| CN111291074B (en) * | 2020-02-27 | 2023-03-28 | 北京思特奇信息技术股份有限公司 | Database query method, system, medium and device |
| SG11202103578SA (en) * | 2020-02-27 | 2021-10-28 | Intercontinental Exchange Holdings Inc | Integrated weather graphical user interface |
| CN113326409A (en) * | 2020-02-29 | 2021-08-31 | 华为技术有限公司 | Table display method, equipment and system |
| US11163761B2 (en) | 2020-03-20 | 2021-11-02 | International Business Machines Corporation | Vector embedding models for relational tables with null or equivalent values |
| US10860609B1 (en) * | 2020-03-25 | 2020-12-08 | Snowflake Inc. | Distributed stop operator for query processing |
| US12248957B2 (en) * | 2020-03-30 | 2025-03-11 | Google Llc | Geographic dataset preparation and analytics systems |
| CN111523034B (en) * | 2020-04-24 | 2023-08-18 | 腾讯科技(深圳)有限公司 | Application processing method, device, equipment and medium |
| IL318034B1 (en) * | 2020-05-24 | 2025-11-01 | Quixotic Labs Inc | Domain-specific language interpreter and interactive visual interface for rapid screening |
| US11526756B1 (en) * | 2020-06-24 | 2022-12-13 | Amazon Technologies, Inc. | Artificial intelligence system with composite models for multiple response-string queries |
| US11823044B2 (en) * | 2020-06-29 | 2023-11-21 | Paypal, Inc. | Query-based recommendation systems using machine learning-trained classifier |
| US11461292B2 (en) | 2020-07-01 | 2022-10-04 | International Business Machines Corporation | Quick data exploration |
| US20220019909A1 (en) * | 2020-07-14 | 2022-01-20 | Adobe Inc. | Intent-based command recommendation generation in an analytics system |
| US11526825B2 (en) * | 2020-07-27 | 2022-12-13 | Cygnvs Inc. | Cloud-based multi-tenancy computing systems and methods for providing response control and analytics |
| CN111913987B (en) * | 2020-08-10 | 2023-08-04 | 东北大学 | A distributed query system and method based on dimension group-time-space-probability filtering |
| US11593014B2 (en) * | 2020-08-14 | 2023-02-28 | EMC IP Holding Company LLC | System and method for approximating replication completion time |
| US12190251B2 (en) * | 2020-08-25 | 2025-01-07 | Alteryx, Inc. | Hybrid machine learning |
| USD949169S1 (en) | 2020-09-14 | 2022-04-19 | Apple Inc. | Display screen or portion thereof with graphical user interface |
| US11460975B2 (en) * | 2020-09-18 | 2022-10-04 | Salesforce, Inc. | Metric presentation within a flow builder |
| CN112347390A (en) * | 2020-09-27 | 2021-02-09 | 北京淇瑀信息科技有限公司 | Channel contract mapping-based resource consumption optimization method and device and electronic equipment |
| CN112182134B (en) * | 2020-09-30 | 2024-04-30 | 北京超图软件股份有限公司 | Construction method and device of space-time database of service system |
| US20220114624A1 (en) * | 2020-10-09 | 2022-04-14 | Adobe Inc. | Digital Content Text Processing and Review Techniques |
| CN112364613B (en) * | 2020-10-30 | 2024-05-03 | 中国运载火箭技术研究院 | Automatic generation system for aircraft test data interpretation report |
| US11188833B1 (en) * | 2020-11-05 | 2021-11-30 | Birdview Films. Llc | Real-time predictive knowledge pattern machine |
| US11948083B2 (en) * | 2020-11-16 | 2024-04-02 | UMNAI Limited | Method for an explainable autoencoder and an explainable generative adversarial network |
| CN112380215B (en) * | 2020-11-17 | 2023-07-28 | 北京融七牛信息技术有限公司 | Automatic feature generation method based on cross aggregation |
| TWI776287B (en) * | 2020-11-24 | 2022-09-01 | 威聯通科技股份有限公司 | Cloud file accessing apparatus and method |
| US12530436B1 (en) | 2020-12-15 | 2026-01-20 | Amdocs Development Limited | System, method, and computer program for orchestrating time-limited AI-inferencing |
| US11645595B2 (en) * | 2020-12-15 | 2023-05-09 | International Business Machines Corporation | Predictive capacity optimizer |
| CN112540879B (en) * | 2020-12-16 | 2024-08-02 | 北京机电工程研究所 | Voting method for double-path redundant interface data |
| CA3144091A1 (en) * | 2020-12-28 | 2022-06-28 | Carbeeza Ltd. | Computer system |
| US20220207007A1 (en) * | 2020-12-30 | 2022-06-30 | Vision Insight Ai Llp | Artificially intelligent master data management |
| US20220222695A1 (en) * | 2021-01-13 | 2022-07-14 | Mastercard International Incorporated | Content communications system with conversation-to-topic microtrend mapping |
| US11301271B1 (en) * | 2021-01-21 | 2022-04-12 | Servicenow, Inc. | Configurable replacements for empty states in user interfaces |
| US11714855B2 (en) | 2021-01-29 | 2023-08-01 | International Business Machines Corporation | Virtual dialog system performance assessment and enrichment |
| US12073423B2 (en) * | 2021-01-30 | 2024-08-27 | Walmart Apollo, Llc | Methods and apparatus for generating target labels from multi-dimensional time series data |
| US12406194B1 (en) | 2021-03-10 | 2025-09-02 | Jasmin Cosic | Devices, systems, and methods for machine consciousness |
| US11657819B2 (en) | 2021-03-25 | 2023-05-23 | Bank Of America Corporation | Selective use of tools for automatically identifying, accessing, and retrieving information responsive to voice requests |
| US11798551B2 (en) | 2021-03-25 | 2023-10-24 | Bank Of America Corporation | System and method for voice controlled automatic information access and retrieval |
| US11782974B2 (en) | 2021-03-25 | 2023-10-10 | Bank Of America Corporation | System and method for dynamically identifying and retrieving information responsive to voice requests |
| CN112925998B (en) * | 2021-03-30 | 2023-07-25 | 北京奇艺世纪科技有限公司 | Interface data processing method, device and system, electronic equipment and storage medium |
| USD1008290S1 (en) * | 2021-04-30 | 2023-12-19 | Siemens Energy Global GmbH & Co. KG | Display screen or portion thereof with a graphical user interface |
| USD1008291S1 (en) * | 2021-04-30 | 2023-12-19 | Siemens Energy Global GmbH & Co. KG | Display screen or portion thereof with a graphical user interface |
| US11645273B2 (en) * | 2021-05-28 | 2023-05-09 | Ocient Holdings LLC | Query execution utilizing probabilistic indexing |
| US12360828B2 (en) | 2021-07-28 | 2025-07-15 | Red Hat, Inc. | Exposing a cloud API based on supported hardware |
| US11630837B2 (en) * | 2021-08-02 | 2023-04-18 | Francis Kanneh | Computer-implemented system and method for creating forecast charts |
| US11477208B1 (en) | 2021-09-15 | 2022-10-18 | Cygnvs Inc. | Systems and methods for providing collaboration rooms with dynamic tenancy and role-based security |
| US12041062B2 (en) | 2021-09-15 | 2024-07-16 | Cygnvs Inc. | Systems for securely tracking incident data and automatically generating data incident reports using collaboration rooms with dynamic tenancy |
| US11354430B1 (en) | 2021-09-16 | 2022-06-07 | Cygnvs Inc. | Systems and methods for dynamically establishing and managing tenancy using templates |
| CN113778424A (en) * | 2021-09-27 | 2021-12-10 | 常州市公共资源交易中心 | Review configuration method, device and storage medium |
| CN114003590B (en) * | 2021-10-29 | 2024-04-30 | 厦门大学 | Quality control method for ocean buoy surface environmental element data |
| US12493608B2 (en) | 2021-11-23 | 2025-12-09 | Express Scripts Strategic Development, Inc. | Automated file correction and fallout processing for failed database entities |
| US11361034B1 (en) | 2021-11-30 | 2022-06-14 | Icertis, Inc. | Representing documents using document keys |
| CN114116731B (en) * | 2022-01-24 | 2022-04-22 | 北京智象信息技术有限公司 | Data separation storage display method and device based on indexedDB storage |
| US11860848B2 (en) * | 2022-01-26 | 2024-01-02 | Applica sp. z o.o. | Encoder-decoder transformer for table generation |
| US11468369B1 (en) | 2022-01-28 | 2022-10-11 | Databricks Inc. | Automated processing of multiple prediction generation including model tuning |
| US20230244720A1 (en) * | 2022-01-28 | 2023-08-03 | Databricks Inc. | Access of data and models associated with multiple prediction generation |
| US11727038B1 (en) * | 2022-01-31 | 2023-08-15 | Vast Data Ltd. | Tabular database regrouping |
| US20230281649A1 (en) * | 2022-03-02 | 2023-09-07 | Amdocs Development Limited | System, method, and computer program for intelligent value stream management |
| US11709994B1 (en) * | 2022-03-04 | 2023-07-25 | Google Llc | Contextual answer generation in spreadsheets |
| US20230368068A1 (en) * | 2022-05-12 | 2023-11-16 | Microsoft Technology Licensing, Llc | Training and implementing a data quality verification model to validate recurring data pipelines |
| USD1026900S1 (en) | 2022-05-20 | 2024-05-14 | Apple Inc. | Wearable device with graphical user interface |
| US11947551B2 (en) * | 2022-05-27 | 2024-04-02 | Maplebear Inc. | Automated sampling of query results for training of a query engine |
| US11907652B2 (en) * | 2022-06-02 | 2024-02-20 | On Time Staffing, Inc. | User interface and systems for document creation |
| US20230419338A1 (en) * | 2022-06-22 | 2023-12-28 | International Business Machines Corporation | Joint learning of time-series models leveraging natural language processing |
| US12307478B2 (en) * | 2022-07-13 | 2025-05-20 | Hahn Stats, Llc | Method and apparatus for dynamically adjusting to impact of media mentions |
| US11861732B1 (en) * | 2022-07-27 | 2024-01-02 | Intuit Inc. | Industry-profile service for fraud detection |
| US12056473B2 (en) | 2022-08-01 | 2024-08-06 | Servicenow, Inc. | Low-code / no-code layer for interactive application development |
| US12381795B2 (en) | 2022-08-18 | 2025-08-05 | Data Robot, Inc. | Self-join automated feature discovery |
| US20240086570A1 (en) * | 2022-09-12 | 2024-03-14 | Relyance Inc. | Technologies for use of observability data for data privacy, data protection, and data governance |
| US11954167B1 (en) * | 2022-12-21 | 2024-04-09 | Google Llc | Techniques for presenting graphical content in a search result |
| CN116303787A (en) * | 2023-03-15 | 2023-06-23 | 北京人大金仓信息技术股份有限公司 | Data processing method, storage medium and equipment of database cluster |
| CN116860786A (en) * | 2023-07-11 | 2023-10-10 | 北京火山引擎科技有限公司 | Database-based data query method, device, electronic equipment and storage medium |
| US11899636B1 (en) | 2023-07-13 | 2024-02-13 | Fmr Llc | Capturing and maintaining a timeline of data changes in a relational database system |
| US12380095B2 (en) * | 2023-10-10 | 2025-08-05 | Sap Se | Framework for query parameterization |
| US20250139276A1 (en) * | 2023-10-27 | 2025-05-01 | Sap Se | User-specific access control for metadata tables |
| US12135765B1 (en) * | 2023-12-28 | 2024-11-05 | The Strategic Coach Inc. | Apparatus and methods for determining a probability datum |
| US12452126B2 (en) | 2024-02-13 | 2025-10-21 | T-Mobile Usa, Inc. | Provisioning flow troubleshooting tool |
| CN117975696B (en) * | 2024-03-28 | 2024-07-05 | 南京邦固消防科技有限公司 | Linkage type fire alarm control system and method |
| US12432121B1 (en) * | 2024-04-02 | 2025-09-30 | Dell Products L.P. | Rollback orchestration module for deployed and dependent forecasting models at the edge |
| US12298997B1 (en) * | 2024-06-21 | 2025-05-13 | BigObject Private Limited | Data exploration apparatus, cascading data exploration method, and non-transitory computer readable storage medium thereof |
| CN118861457B (en) * | 2024-07-05 | 2025-03-18 | 深圳正中云有限公司 | A method and system for dynamically generating business forms |
| US12387013B1 (en) * | 2024-12-30 | 2025-08-12 | Athos Therapeutics Inc. | Data integration and quality control system |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6553383B1 (en) * | 1997-06-12 | 2003-04-22 | Guillaume Martin | Device for data analysis and organization |
| US20030105732A1 (en) * | 2000-11-17 | 2003-06-05 | Kagalwala Raxit A. | Database schema for structure query language (SQL) server |
| US20050154710A1 (en) * | 2004-01-08 | 2005-07-14 | International Business Machines Corporation | Dynamic bitmap processing, identification and reusability |
| US20060063156A1 (en) * | 2002-12-06 | 2006-03-23 | Willman Cheryl L | Outcome prediction and risk classification in childhood leukemia |
| US20070005420A1 (en) * | 2005-06-30 | 2007-01-04 | Microsoft Corporation | Adjustment of inventory estimates |
Family Cites Families (183)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5369757A (en) * | 1991-06-18 | 1994-11-29 | Digital Equipment Corporation | Recovery logging in the presence of snapshot files by ordering of buffer pool flushing |
| WO1993021587A2 (en) | 1992-04-15 | 1993-10-28 | Inference Corporation | Machine learning with a relational database |
| JPH0689307A (en) | 1992-05-04 | 1994-03-29 | Internatl Business Mach Corp <Ibm> | Device and method for displaying information in database |
| US5649104A (en) | 1993-03-19 | 1997-07-15 | Ncr Corporation | System for allowing user of any computer to draw image over that generated by the host computer and replicating the drawn image to other computers |
| US5608872A (en) | 1993-03-19 | 1997-03-04 | Ncr Corporation | System for allowing all remote computers to perform annotation on an image and replicating the annotated image on the respective displays of other computers |
| US5577188A (en) | 1994-05-31 | 1996-11-19 | Future Labs, Inc. | Method to provide for virtual screen overlay |
| US5715374A (en) | 1994-06-29 | 1998-02-03 | Microsoft Corporation | Method and system for case-based reasoning utilizing a belief network |
| US5701400A (en) | 1995-03-08 | 1997-12-23 | Amado; Carlos Armando | Method and apparatus for applying if-then-else rules to data sets in a relational data base and generating from the results of application of said rules a database of diagnostics linked to said data sets to aid executive analysis of financial data |
| GB2300991B (en) | 1995-05-15 | 1997-11-05 | Andrew Macgregor Ritchie | Serving signals to browsing clients |
| US5715450A (en) | 1995-09-27 | 1998-02-03 | Siebel Systems, Inc. | Method of selecting and presenting data from a database using a query language to a user of a computer system |
| US5831610A (en) | 1996-02-23 | 1998-11-03 | Netsuite Development L.P. | Designing networks |
| US5821937A (en) | 1996-02-23 | 1998-10-13 | Netsuite Development, L.P. | Computer method for updating a network design |
| US5873096A (en) | 1997-10-08 | 1999-02-16 | Siebel Systems, Inc. | Method of maintaining a network of partially replicated database system |
| US6604117B2 (en) | 1996-03-19 | 2003-08-05 | Siebel Systems, Inc. | Method of maintaining a network of partially replicated database system |
| AU6183698A (en) | 1997-02-26 | 1998-09-18 | Siebel Systems, Inc. | Method of determining visibility to a remote database client of a plurality of database transactions having variable visibility strengths |
| AU6336698A (en) | 1997-02-26 | 1998-09-29 | Siebel Systems, Inc. | Distributed relational database |
| WO1998038762A2 (en) | 1997-02-26 | 1998-09-03 | Siebel Systems, Inc. | Determining visibility to a remote database client |
| AU6440398A (en) | 1997-02-26 | 1998-09-18 | Siebel Systems, Inc. | Method of using a cache to determine the visibility to a remote database client of a plurality of database transactions |
| EP1021775A4 (en) | 1997-02-26 | 2005-05-11 | Siebel Systems Inc | Method of determining the visibility to a remote database client of a plurality of database transactions using simplified visibility rules |
| AU6336798A (en) | 1997-02-27 | 1998-09-29 | Siebel Systems, Inc. | Method of synchronizing independently distributed software and database schema |
| JP2001514776A (en) | 1997-02-27 | 2001-09-11 | シーベル システムズ,インコーポレイティド | A method of continuous level transport of software distribution incorporating local modifications. |
| JP2001513926A (en) | 1997-02-28 | 2001-09-04 | シーベル システムズ,インコーポレイティド | Partially replicated distributed database with multiple levels of remote clients |
| US6169534B1 (en) | 1997-06-26 | 2001-01-02 | Upshot.Com | Graphical user interface for customer information management |
| US5918159A (en) | 1997-08-04 | 1999-06-29 | Fomukong; Mundi | Location reporting satellite paging system with optional blocking of location reporting |
| US6560461B1 (en) | 1997-08-04 | 2003-05-06 | Mundi Fomukong | Authorized location reporting paging system |
| US6629095B1 (en) * | 1997-10-14 | 2003-09-30 | International Business Machines Corporation | System and method for integrating data mining into a relational database management system |
| JPH11203320A (en) * | 1998-01-20 | 1999-07-30 | Hitachi Ltd | Database preprocessing method |
| US20020059095A1 (en) | 1998-02-26 | 2002-05-16 | Cook Rachael Linette | System and method for generating, capturing, and managing customer lead information over a computer network |
| US6732111B2 (en) | 1998-03-03 | 2004-05-04 | Siebel Systems, Inc. | Method, apparatus, system, and program product for attaching files and other objects to a partially replicated database |
| US5963953A (en) | 1998-03-30 | 1999-10-05 | Siebel Systems, Inc. | Method, and system for product configuration |
| US6092086A (en) * | 1998-03-31 | 2000-07-18 | Bmc Software | System and method for handling backout processing during capture of changed data in an enterprise computer system |
| US6510419B1 (en) * | 1998-04-24 | 2003-01-21 | Starmine Corporation | Security analyst performance tracking and analysis system and method |
| AU5791899A (en) | 1998-08-27 | 2000-03-21 | Upshot Corporation | A method and apparatus for network-based sales force management |
| US6601087B1 (en) | 1998-11-18 | 2003-07-29 | Webex Communications, Inc. | Instant document sharing |
| US6728960B1 (en) | 1998-11-18 | 2004-04-27 | Siebel Systems, Inc. | Techniques for managing multiple threads in a browser environment |
| US6393605B1 (en) | 1998-11-18 | 2002-05-21 | Siebel Systems, Inc. | Apparatus and system for efficient delivery and deployment of an application |
| JP2002531896A (en) | 1998-11-30 | 2002-09-24 | シーベル システムズ,インコーポレイティド | Call center using smart script |
| WO2000033238A2 (en) | 1998-11-30 | 2000-06-08 | Siebel Systems, Inc. | Assignment manager |
| WO2000033235A1 (en) | 1998-11-30 | 2000-06-08 | Siebel Systems, Inc. | State models for monitoring processes |
| EP1135723A4 (en) | 1998-11-30 | 2005-02-16 | Siebel Systems Inc | Development tool, method, and system for client server applications |
| US20020072951A1 (en) | 1999-03-03 | 2002-06-13 | Michael Lee | Marketing support database management method, system and program product |
| US6574635B2 (en) | 1999-03-03 | 2003-06-03 | Siebel Systems, Inc. | Application instantiation based upon attributes and values stored in a meta data repository, including tiering of application layers objects and components |
| US6621834B1 (en) | 1999-11-05 | 2003-09-16 | Raindance Communications, Inc. | System and method for voice transmission over network protocols |
| US6535909B1 (en) | 1999-11-18 | 2003-03-18 | Contigo Software, Inc. | System and method for record and playback of collaborative Web browsing session |
| US6324568B1 (en) | 1999-11-30 | 2001-11-27 | Siebel Systems, Inc. | Method and system for distributing objects over a network |
| US6654032B1 (en) | 1999-12-23 | 2003-11-25 | Webex Communications, Inc. | Instant sharing of documents on a remote server |
| US6609050B2 (en) | 2000-01-20 | 2003-08-19 | Daimlerchrysler Corporation | Vehicle warranty and repair computer-networked system |
| US7266502B2 (en) | 2000-03-31 | 2007-09-04 | Siebel Systems, Inc. | Feature centric release manager method and system |
| US6732100B1 (en) | 2000-03-31 | 2004-05-04 | Siebel Systems, Inc. | Database access method and system for user role defined access |
| US6336137B1 (en) | 2000-03-31 | 2002-01-01 | Siebel Systems, Inc. | Web client-server system and method for incompatible page markup and presentation languages |
| US6577726B1 (en) | 2000-03-31 | 2003-06-10 | Siebel Systems, Inc. | Computer telephony integration hotelling method and system |
| US6842748B1 (en) | 2000-04-14 | 2005-01-11 | Rightnow Technologies, Inc. | Usage based strength between related information in an information retrieval system |
| US6434550B1 (en) | 2000-04-14 | 2002-08-13 | Rightnow Technologies, Inc. | Temporal updates of relevancy rating of retrieved information in an information search system |
| US6665655B1 (en) | 2000-04-14 | 2003-12-16 | Rightnow Technologies, Inc. | Implicit rating of retrieved information in an information search system |
| US7730072B2 (en) | 2000-04-14 | 2010-06-01 | Rightnow Technologies, Inc. | Automated adaptive classification system for knowledge networks |
| US6763501B1 (en) | 2000-06-09 | 2004-07-13 | Webex Communications, Inc. | Remote document serving |
| US7877312B2 (en) | 2000-06-22 | 2011-01-25 | Wgal, Llp | Apparatus and method for displaying trading trends |
| US7249048B1 (en) * | 2000-06-30 | 2007-07-24 | Ncr Corporation | Incorporating predictive models within interactive business analysis processes |
| US7117208B2 (en) * | 2000-09-28 | 2006-10-03 | Oracle Corporation | Enterprise web mining system and method |
| KR100365357B1 (en) | 2000-10-11 | 2002-12-18 | 엘지전자 주식회사 | Method for data communication of mobile terminal |
| US7080026B2 (en) * | 2000-10-27 | 2006-07-18 | Manugistics, Inc. | Supply chain demand forecasting and planning |
| US7581230B2 (en) | 2001-02-06 | 2009-08-25 | Siebel Systems, Inc. | Adaptive communication application programming interface |
| USD454139S1 (en) | 2001-02-20 | 2002-03-05 | Rightnow Technologies | Display screen for a computer |
| US6785684B2 (en) * | 2001-03-27 | 2004-08-31 | International Business Machines Corporation | Apparatus and method for determining clustering factor in a database using block level sampling |
| US7174514B2 (en) | 2001-03-28 | 2007-02-06 | Siebel Systems, Inc. | Engine to present a user interface based on a logical structure, such as one for a customer relationship management system, across a web site |
| US6829655B1 (en) | 2001-03-28 | 2004-12-07 | Siebel Systems, Inc. | Method and system for server synchronization with a computing device via a companion device |
| US7363388B2 (en) | 2001-03-28 | 2008-04-22 | Siebel Systems, Inc. | Method and system for direct server synchronization with a computing device |
| US20030206192A1 (en) | 2001-03-31 | 2003-11-06 | Mingte Chen | Asynchronous message push to web browser |
| US20030018705A1 (en) | 2001-03-31 | 2003-01-23 | Mingte Chen | Media-independent communication server |
| US6732095B1 (en) | 2001-04-13 | 2004-05-04 | Siebel Systems, Inc. | Method and apparatus for mapping between XML and relational representations |
| US7761288B2 (en) | 2001-04-30 | 2010-07-20 | Siebel Systems, Inc. | Polylingual simultaneous shipping of software |
| US7111023B2 (en) * | 2001-05-24 | 2006-09-19 | Oracle International Corporation | Synchronous change data capture in a relational database |
| US20020178146A1 (en) | 2001-05-24 | 2002-11-28 | International Business Machines Corporation | System and method for selective object history retention |
| JP2002358402A (en) * | 2001-05-31 | 2002-12-13 | Dentsu Tec Inc | Sales forecasting method based on customer value using three indicator axes |
| US6691115B2 (en) * | 2001-06-15 | 2004-02-10 | Hewlett-Packard Development Company, L.P. | System and method for purging database update image files after completion of associated transactions for a database replication system with multiple audit logs |
| US6711565B1 (en) | 2001-06-18 | 2004-03-23 | Siebel Systems, Inc. | Method, apparatus, and system for previewing search results |
| US6728702B1 (en) | 2001-06-18 | 2004-04-27 | Siebel Systems, Inc. | System and method to implement an integrated search center supporting a full-text search and query on a database |
| US6782383B2 (en) | 2001-06-18 | 2004-08-24 | Siebel Systems, Inc. | System and method to implement a persistent and dismissible search center frame |
| US6763351B1 (en) | 2001-06-18 | 2004-07-13 | Siebel Systems, Inc. | Method, apparatus, and system for attaching search results |
| US20030004971A1 (en) | 2001-06-29 | 2003-01-02 | Gong Wen G. | Automatic generation of data models and accompanying user interfaces |
| JP2003058697A (en) * | 2001-08-02 | 2003-02-28 | Ncr Internatl Inc | Integrating method using computer of prediction model in analytic environment of business |
| US6826582B1 (en) | 2001-09-28 | 2004-11-30 | Emc Corporation | Method and system for using file systems for content management |
| US6724399B1 (en) | 2001-09-28 | 2004-04-20 | Siebel Systems, Inc. | Methods and apparatus for enabling keyboard accelerators in applications implemented via a browser |
| US6978445B2 (en) | 2001-09-28 | 2005-12-20 | Siebel Systems, Inc. | Method and system for supporting user navigation in a browser environment |
| US7761535B2 (en) | 2001-09-28 | 2010-07-20 | Siebel Systems, Inc. | Method and system for server synchronization with a computing device |
| US6993712B2 (en) | 2001-09-28 | 2006-01-31 | Siebel Systems, Inc. | System and method for facilitating user interaction in a browser environment |
| US7962565B2 (en) | 2001-09-29 | 2011-06-14 | Siebel Systems, Inc. | Method, apparatus and system for a mobile web client |
| US8359335B2 (en) | 2001-09-29 | 2013-01-22 | Siebel Systems, Inc. | Computing system and method to implicitly commit unsaved data for a world wide web application |
| US6901595B2 (en) | 2001-09-29 | 2005-05-31 | Siebel Systems, Inc. | Method, apparatus, and system for implementing a framework to support a web-based application |
| US7146617B2 (en) | 2001-09-29 | 2006-12-05 | Siebel Systems, Inc. | Method, apparatus, and system for implementing view caching in a framework to support web-based applications |
| US6980988B1 (en) * | 2001-10-01 | 2005-12-27 | Oracle International Corporation | Method of applying changes to a standby database system |
| US7289949B2 (en) | 2001-10-09 | 2007-10-30 | Right Now Technologies, Inc. | Method for routing electronic correspondence based on the level and type of emotion contained therein |
| US6804330B1 (en) | 2002-01-04 | 2004-10-12 | Siebel Systems, Inc. | Method and system for accessing CRM data via voice |
| US7058890B2 (en) | 2002-02-13 | 2006-06-06 | Siebel Systems, Inc. | Method and system for enabling connectivity to a data system |
| US7451065B2 (en) * | 2002-03-11 | 2008-11-11 | International Business Machines Corporation | Method for constructing segmentation-based predictive models |
| US7672853B2 (en) | 2002-03-29 | 2010-03-02 | Siebel Systems, Inc. | User interface for processing requests for approval |
| US7131071B2 (en) | 2002-03-29 | 2006-10-31 | Siebel Systems, Inc. | Defining an approval process for requests for approval |
| US6850949B2 (en) | 2002-06-03 | 2005-02-01 | Right Now Technologies, Inc. | System and method for generating a dynamic interface via a communications network |
| US7594181B2 (en) | 2002-06-27 | 2009-09-22 | Siebel Systems, Inc. | Prototyping graphical user interfaces |
| US8639542B2 (en) | 2002-06-27 | 2014-01-28 | Siebel Systems, Inc. | Method and apparatus to facilitate development of a customer-specific business process model |
| US7437720B2 (en) | 2002-06-27 | 2008-10-14 | Siebel Systems, Inc. | Efficient high-interactivity user interface for client-server applications |
| US20040010489A1 (en) | 2002-07-12 | 2004-01-15 | Rightnow Technologies, Inc. | Method for providing search-specific web pages in a network computing environment |
| US7251787B2 (en) | 2002-08-28 | 2007-07-31 | Siebel Systems, Inc. | Method and apparatus for an integrated process modeller |
| US7472114B1 (en) | 2002-09-18 | 2008-12-30 | Symantec Corporation | Method and apparatus to define the scope of a search for information from a tabular data source |
| GB2397401A (en) * | 2003-01-15 | 2004-07-21 | Luke Leonard Martin Porter | Time in databases and applications of databases |
| US9448860B2 (en) | 2003-03-21 | 2016-09-20 | Oracle America, Inc. | Method and architecture for providing data-change alerts to external applications via a push service |
| US7287041B2 (en) | 2003-03-24 | 2007-10-23 | Siebel Systems, Inc. | Data modeling using custom data types |
| WO2004086198A2 (en) | 2003-03-24 | 2004-10-07 | Siebel Systems, Inc. | Common common object |
| US7904340B2 (en) | 2003-03-24 | 2011-03-08 | Siebel Systems, Inc. | Methods and computer-readable medium for defining a product model |
| US8762415B2 (en) | 2003-03-25 | 2014-06-24 | Siebel Systems, Inc. | Modeling of order data |
| US7685515B2 (en) | 2003-04-04 | 2010-03-23 | Netsuite, Inc. | Facilitating data manipulation in a browser-based user interface of an enterprise business application |
| US7620655B2 (en) | 2003-05-07 | 2009-11-17 | Enecto Ab | Method, device and computer program product for identifying visitors of websites |
| US7206965B2 (en) | 2003-05-23 | 2007-04-17 | General Electric Company | System and method for processing a new diagnostics case relative to historical case data and determining a ranking for possible repairs |
| US7409336B2 (en) | 2003-06-19 | 2008-08-05 | Siebel Systems, Inc. | Method and system for searching data based on identified subset of categories and relevance-scored text representation-category combinations |
| US20040260659A1 (en) | 2003-06-23 | 2004-12-23 | Len Chan | Function space reservation system |
| US7237227B2 (en) | 2003-06-30 | 2007-06-26 | Siebel Systems, Inc. | Application user interface template with free-form layout |
| US7694314B2 (en) | 2003-08-28 | 2010-04-06 | Siebel Systems, Inc. | Universal application network architecture |
| US7668950B2 (en) * | 2003-09-23 | 2010-02-23 | Marchex, Inc. | Automatically updating performance-based online advertising system and method |
| US7779039B2 (en) | 2004-04-02 | 2010-08-17 | Salesforce.Com, Inc. | Custom entities and fields in a multi-tenant database system |
| JP4690199B2 (en) * | 2003-10-07 | 2011-06-01 | 株式会社リバース・プロテオミクス研究所 | Method for visualizing correlation data between biological events and computer-readable recording medium |
| US7461089B2 (en) | 2004-01-08 | 2008-12-02 | International Business Machines Corporation | Method and system for creating profiling indices |
| US7167866B2 (en) * | 2004-01-23 | 2007-01-23 | Microsoft Corporation | Selective multi level expansion of data base via pivot point data |
| US20090006156A1 (en) | 2007-01-26 | 2009-01-01 | Herbert Dennis Hunt | Associating a granting matrix with an analytic platform |
| US7171424B2 (en) | 2004-03-04 | 2007-01-30 | International Business Machines Corporation | System and method for managing presentation of data |
| US7590639B1 (en) * | 2004-04-29 | 2009-09-15 | Sap Ag | System and method for ordering a database flush sequence at transaction commit |
| US7398268B2 (en) * | 2004-07-09 | 2008-07-08 | Microsoft Corporation | Systems and methods that facilitate data mining |
| US7289976B2 (en) | 2004-12-23 | 2007-10-30 | Microsoft Corporation | Easy-to-use data report specification |
| JP2006215936A (en) | 2005-02-07 | 2006-08-17 | Hitachi Ltd | Search system and search method |
| US20060218132A1 (en) * | 2005-03-25 | 2006-09-28 | Oracle International Corporation | Predictive data mining SQL functions (operators) |
| US7752048B2 (en) | 2005-05-27 | 2010-07-06 | Oracle International Corporation | Method and apparatus for providing speech recognition resolution on a database |
| US20070073685A1 (en) * | 2005-09-26 | 2007-03-29 | Robert Thibodeau | Systems and methods for valuing receivables |
| US20070136429A1 (en) | 2005-12-09 | 2007-06-14 | Fine Leslie R | Methods and systems for building participant profiles |
| US8065326B2 (en) | 2006-02-01 | 2011-11-22 | Oracle International Corporation | System and method for building decision trees in a database |
| US7743052B2 (en) * | 2006-02-14 | 2010-06-22 | International Business Machines Corporation | Method and apparatus for projecting the effect of maintaining an auxiliary database structure for use in executing database queries |
| CN101093496A (en) * | 2006-06-23 | 2007-12-26 | 微软公司 | Multi-stage associate storage structure and storage method thereof |
| US8693690B2 (en) | 2006-12-04 | 2014-04-08 | Red Hat, Inc. | Organizing an extensible table for storing cryptographic objects |
| US8954500B2 (en) | 2008-01-04 | 2015-02-10 | Yahoo! Inc. | Identifying and employing social network relationships |
| US7788200B2 (en) * | 2007-02-02 | 2010-08-31 | Microsoft Corporation | Goal seeking using predictive analytics |
| US7797356B2 (en) * | 2007-02-02 | 2010-09-14 | Microsoft Corporation | Dynamically detecting exceptions based on data changes |
| US7680882B2 (en) | 2007-03-06 | 2010-03-16 | Friendster, Inc. | Multimedia aggregation in an online social network |
| JP2008269215A (en) * | 2007-04-19 | 2008-11-06 | Nippon Telegr & Teleph Corp <Ntt> | Singular pattern detection system, model learning device, singular pattern detection method, and computer program |
| US7987161B2 (en) | 2007-08-23 | 2011-07-26 | Thomson Reuters (Markets) Llc | System and method for data compression using compression hardware |
| US20090119172A1 (en) | 2007-11-02 | 2009-05-07 | Soloff David L | Advertising Futures Marketplace Methods and Systems |
| US20100318511A1 (en) | 2007-11-13 | 2010-12-16 | VirtualAgility | Techniques for connectors in a system for collaborative work |
| US8126881B1 (en) * | 2007-12-12 | 2012-02-28 | Vast.com, Inc. | Predictive conversion systems and methods |
| US8876607B2 (en) * | 2007-12-18 | 2014-11-04 | Yahoo! Inc. | Visual display of fantasy sports team starting roster data trends |
| US8234248B2 (en) * | 2008-01-24 | 2012-07-31 | Oracle International Corporation | Tracking changes to a business object |
| US8171021B2 (en) | 2008-06-23 | 2012-05-01 | Google Inc. | Query identification and association |
| US20100131496A1 (en) * | 2008-11-26 | 2010-05-27 | Yahoo! Inc. | Predictive indexing for fast search |
| US20100211485A1 (en) * | 2009-02-17 | 2010-08-19 | Augustine Nancy L | Systems and methods of time period comparisons |
| FR2944006B1 (en) | 2009-04-03 | 2011-04-01 | Inst Francais Du Petrole | BACTERIA CAPABLE OF DEGRADING MULTIPLE PETROLEUM COMPOUNDS IN SOLUTION IN AQUEOUS EFFLUENTS AND PROCESS FOR TREATING SAID EFFLUENTS |
| US8645337B2 (en) | 2009-04-30 | 2014-02-04 | Oracle International Corporation | Storing compression units in relational tables |
| US20100287146A1 (en) * | 2009-05-11 | 2010-11-11 | Dean Skelton | System and method for change analytics based forecast and query optimization and impact identification in a variance-based forecasting system with visualization |
| US20100299367A1 (en) | 2009-05-20 | 2010-11-25 | Microsoft Corporation | Keyword Searching On Database Views |
| US20100324927A1 (en) | 2009-06-17 | 2010-12-23 | Tinsley Eric C | Senior care navigation systems and methods for using the same |
| US9852193B2 (en) * | 2009-08-10 | 2017-12-26 | Ebay Inc. | Probabilistic clustering of an item |
| US8706715B2 (en) | 2009-10-05 | 2014-04-22 | Salesforce.Com, Inc. | Methods and systems for joining indexes for query optimization in a multi-tenant database |
| JP2011154554A (en) * | 2010-01-27 | 2011-08-11 | Nec Corp | Deficit value prediction device, deficit value prediction method, and deficit value prediction program |
| US8271435B2 (en) * | 2010-01-29 | 2012-09-18 | Oracle International Corporation | Predictive categorization |
| US8874600B2 (en) | 2010-01-30 | 2014-10-28 | International Business Machines Corporation | System and method for building a cloud aware massive data analytics solution background |
| CN102193939B (en) * | 2010-03-10 | 2016-04-06 | 阿里巴巴集团控股有限公司 | The implementation method of information navigation, information navigation server and information handling system |
| WO2011130706A2 (en) * | 2010-04-16 | 2011-10-20 | Salesforce.Com, Inc. | Methods and systems for performing cross store joins in a multi-tenant store |
| US10162851B2 (en) * | 2010-04-19 | 2018-12-25 | Salesforce.Com, Inc. | Methods and systems for performing cross store joins in a multi-tenant store |
| US20110282806A1 (en) | 2010-05-12 | 2011-11-17 | Jarrod Wilcox | Method and apparatus for investment allocation |
| JP5440394B2 (en) * | 2010-05-31 | 2014-03-12 | ソニー株式会社 | Evaluation prediction apparatus, evaluation prediction method, and program |
| CN101894316A (en) * | 2010-06-10 | 2010-11-24 | 焦点科技股份有限公司 | A method and system for monitoring index of international market prosperity |
| US20120215560A1 (en) * | 2010-07-21 | 2012-08-23 | dbMotion Ltd. | System and methods for facilitating computerized interactions with emrs |
| US8903805B2 (en) | 2010-08-20 | 2014-12-02 | Oracle International Corporation | Method and system for performing query optimization using a hybrid execution plan |
| US20120072972A1 (en) * | 2010-09-20 | 2012-03-22 | Microsoft Corporation | Secondary credentials for batch system |
| JP2012194741A (en) * | 2011-03-16 | 2012-10-11 | Nec Corp | Prediction device of missing value in matrix data, method for calculating missing value prediction, and missing value prediction program |
| US9235620B2 (en) * | 2012-08-14 | 2016-01-12 | Amadeus S.A.S. | Updating cached database query results |
| US20120310763A1 (en) | 2011-06-06 | 2012-12-06 | Michael Meehan | System and methods for matching potential buyers and sellers of complex offers |
| US20120317058A1 (en) | 2011-06-13 | 2012-12-13 | Abhulimen Kingsley E | Design of computer based risk and safety management system of complex production and multifunctional process facilities-application to fpso's |
| US8773437B1 (en) | 2011-07-12 | 2014-07-08 | Relationship Science LLC | Weighting paths in a social graph based on time |
| CN102254034A (en) * | 2011-08-08 | 2011-11-23 | 浙江鸿程计算机系统有限公司 | Online analytical processing (OLAP) query log mining and recommending method based on efficient mining of frequent closed sequences (BIDE) |
| US11755663B2 (en) * | 2012-10-22 | 2023-09-12 | Recorded Future, Inc. | Search activity prediction |
| WO2013086384A1 (en) | 2011-12-08 | 2013-06-13 | Oracle International Corporation | Techniques for maintaining column vectors of relational data within volatile memory |
| US20140040162A1 (en) | 2012-02-21 | 2014-02-06 | Salesforce.Com, Inc. | Method and system for providing information from a customer relationship management system |
| US9613014B2 (en) | 2012-03-09 | 2017-04-04 | AgileQR, Inc. | Systems and methods for personalization and engagement by passive connection |
| US8983936B2 (en) | 2012-04-04 | 2015-03-17 | Microsoft Corporation | Incremental visualization for structured data in an enterprise-level data store |
| US20140019207A1 (en) | 2012-07-11 | 2014-01-16 | Sap Ag | Interactive in-memory based sales forecasting |
| US10152511B2 (en) * | 2012-09-14 | 2018-12-11 | Salesforce.Com, Inc. | Techniques for optimization of inner queries |
| US20140149554A1 (en) | 2012-11-29 | 2014-05-29 | Ricoh Co., Ltd. | Unified Server for Managing a Heterogeneous Mix of Devices |

2013
- 2013-08-29: application US14/014,225, publication US9342836B2, status Active
- 2013-08-29: application US14/014,204, publication US20140280065A1, status Abandoned
- 2013-08-29: application US14/014,258, publication US9240016B2, status Active
- 2013-08-29: application US14/014,241, publication US9336533B2, status Active
- 2013-08-29: application US14/014,264, publication US9235846B2, status Active
- 2013-08-29: application US14/014,250, publication US9349132B2, status Active
- 2013-08-29: application US14/014,221, publication US9367853B2, status Active
- 2013-08-29: application US14/014,236, publication US9454767B2, status Active
- 2013-08-29: application US14/014,269, publication US9390428B2, status Active
- 2013-08-29: application US14/014,271, publication US10860557B2, status Active
- 2013-11-14: application EP13798495.1A, publication EP2973004A1, status Ceased
- 2013-11-14: application JP2016500106A, publication JP6412550B2, status Active
- 2013-11-14: application CN201910477454.0A, publication CN110309119B, status Active
- 2013-11-14: application PCT/US2013/070198, publication WO2014143208A1, status Ceased
- 2013-11-14: application CA2904526A, publication CA2904526C, status Active
- 2013-11-14: application CN201380076609.0A, publication CN105229633B, status Active

2016
- 2016-01-11: application US14/992,925, publication US9753962B2, status Active
- 2016-06-13: application US15/181,256, publication US9690815B2, status Active
- 2016-08-26: application US15/249,026, publication US10963541B2, status Active

2018
- 2018-09-28: application JP2018183155A, publication JP6608500B2, status Active

Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6553383B1 (en) * | 1997-06-12 | 2003-04-22 | Guillaume Martin | Device for data analysis and organization |
| US20030105732A1 (en) * | 2000-11-17 | 2003-06-05 | Kagalwala Raxit A. | Database schema for structure query language (SQL) server |
| US20060063156A1 (en) * | 2002-12-06 | 2006-03-23 | Willman Cheryl L | Outcome prediction and risk classification in childhood leukemia |
| US20050154710A1 (en) * | 2004-01-08 | 2005-07-14 | International Business Machines Corporation | Dynamic bitmap processing, identification and reusability |
| US20070005420A1 (en) * | 2005-06-30 | 2007-01-04 | Microsoft Corporation | Adjustment of inventory estimates |
Non-Patent Citations (4)
| Title |
|---|
| Hohpe, "Google Cloud Data Platform & Services," Year 2010 * |
| UCLA, "Frequently Asked Questions about MLwiN How do I create predicted values?" Sep. 18, 2002 * |
| Williams, "Prior Knowledge: A Predictive Database For Developers," September 8-12, 2012 * |
| Wilson, "Machine Learning Throwdown" Year 2012 * |
Cited By (52)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150327135A1 (en) * | 2014-04-24 | 2015-11-12 | Futurewei Technologies, Inc. | Apparatus and method for dynamic hybrid routing in sdn networks to avoid congestion and balance loads under changing traffic load |
| US9680665B2 (en) * | 2014-04-24 | 2017-06-13 | Futurewei Technologies, Inc. | Apparatus and method for dynamic hybrid routing in SDN networks to avoid congestion and balance loads under changing traffic load |
| US20150310021A1 (en) * | 2014-04-28 | 2015-10-29 | International Business Machines Corporation | Big data analytics brokerage |
| US10430401B2 (en) * | 2014-04-28 | 2019-10-01 | International Business Machines Corporation | Big data analytics brokerage |
| US20160110362A1 (en) * | 2014-10-20 | 2016-04-21 | International Business Machines Corporation | Automatic enumeration of data analysis options and rapid analysis of statistical models |
| US20160110410A1 (en) * | 2014-10-20 | 2016-04-21 | International Business Machines Corporation | Automatic enumeration of data analysis options and rapid analysis of statistical models |
| US10353890B2 (en) * | 2014-10-20 | 2019-07-16 | International Business Machines Corporation | Automatic enumeration of data analysis options and rapid analysis of statistical models |
| US10346393B2 (en) * | 2014-10-20 | 2019-07-09 | International Business Machines Corporation | Automatic enumeration of data analysis options and rapid analysis of statistical models |
| US10942911B2 (en) | 2015-03-20 | 2021-03-09 | D&B Business Information Solutions | Aggregating high volumes of temporal data from multiple overlapping sources |
| US20160292236A1 (en) * | 2015-04-01 | 2016-10-06 | International Business Machines Corporation | Supporting multi-tenant applications on a shared database using pre-defined attributes |
| US20160292216A1 (en) * | 2015-04-01 | 2016-10-06 | International Business Machines Corporation | Supporting multi-tenant applications on a shared database using pre-defined attributes |
| US10628388B2 (en) * | 2015-04-01 | 2020-04-21 | International Business Machines Corporation | Supporting multi-tenant applications on a shared database using pre-defined attributes |
| US10528528B2 (en) * | 2015-04-01 | 2020-01-07 | International Business Machines Corporation | Supporting multi-tenant applications on a shared database using pre-defined attributes |
| US9916351B2 (en) | 2015-10-19 | 2018-03-13 | International Business Machines Corporation | Joining operations in document oriented databases |
| US9418106B1 (en) * | 2015-10-19 | 2016-08-16 | International Business Machines Corporation | Joining operations in document oriented databases |
| US9916360B2 (en) | 2015-10-19 | 2018-03-13 | International Business Machines Corporation | Joining operations in document oriented databases |
| US10262037B2 (en) | 2015-10-19 | 2019-04-16 | International Business Machines Corporation | Joining operations in document oriented databases |
| US10438126B2 (en) * | 2015-12-31 | 2019-10-08 | General Electric Company | Systems and methods for data estimation and forecasting |
| US10179282B2 (en) | 2016-02-26 | 2019-01-15 | Impyrium, Inc. | Joystick input apparatus with living hinges |
| US20170351752A1 (en) * | 2016-06-07 | 2017-12-07 | Panoramix Solutions | Systems and methods for identifying and classifying text |
| US10650008B2 (en) * | 2016-08-26 | 2020-05-12 | International Business Machines Corporation | Parallel scoring of an ensemble model |
| US10902005B2 (en) | 2016-08-26 | 2021-01-26 | International Business Machines Corporation | Parallel scoring of an ensemble model |
| US20180060324A1 (en) * | 2016-08-26 | 2018-03-01 | International Business Machines Corporation | Parallel scoring of an ensemble model |
| US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
| US10867085B2 (en) * | 2017-03-10 | 2020-12-15 | General Electric Company | Systems and methods for overlaying and integrating computer aided design (CAD) drawings with fluid models |
| US10977397B2 (en) | 2017-03-10 | 2021-04-13 | Altair Engineering, Inc. | Optimization of prototype and machine design within a 3D fluid modeling environment |
| US10803211B2 (en) | 2017-03-10 | 2020-10-13 | General Electric Company | Multiple fluid model tool for interdisciplinary fluid modeling |
| US10650114B2 (en) | 2017-03-10 | 2020-05-12 | Ge Aviation Systems Llc | Systems and methods for utilizing a 3D CAD point-cloud to automatically create a fluid model |
| US11538591B2 (en) | 2017-03-10 | 2022-12-27 | Altair Engineering, Inc. | Training and refining fluid models using disparate and aggregated machine data |
| US12333224B2 (en) | 2017-03-10 | 2025-06-17 | Altair Engineering, Inc. | Systems and methods for overlaying and integrating computer aided design (CAD) drawings with fluid models |
| US11947882B2 (en) | 2017-03-10 | 2024-04-02 | Altair Engineering, Inc. | Optimization of prototype and machine design within a 3D fluid modeling environment |
| US11379630B2 (en) | 2017-03-10 | 2022-07-05 | Altair Engineering, Inc. | Systems and methods for utilizing a 3D CAD point-cloud to automatically create a fluid model |
| US10963599B2 (en) | 2017-03-10 | 2021-03-30 | Altair Engineering, Inc. | Systems and methods for utilizing a 3D CAD point-cloud to automatically create a fluid model |
| US11714933B2 (en) | 2017-03-10 | 2023-08-01 | Altair Engineering, Inc. | Systems and methods for utilizing a 3D CAD point-cloud to automatically create a fluid model |
| US11004568B2 (en) | 2017-03-10 | 2021-05-11 | Altair Engineering, Inc. | Systems and methods for multi-dimensional fluid modeling of an organism or organ |
| US20180260501A1 (en) * | 2017-03-10 | 2018-09-13 | General Electric Company | Systems and methods for overlaying and integrating computer aided design (cad) drawings with fluid models |
| US11967434B2 (en) | 2017-03-10 | 2024-04-23 | Altair Engineering, Inc. | Systems and methods for multi-dimensional fluid modeling of an organism or organ |
| US20190042932A1 (en) * | 2017-08-01 | 2019-02-07 | Salesforce Com, Inc. | Techniques and Architectures for Deep Learning to Support Security Threat Detection |
| US11915309B1 (en) | 2018-06-12 | 2024-02-27 | Wells Fargo Bank, N.A. | Computer-based systems for calculating risk of asset transfers |
| US11468505B1 (en) | 2018-06-12 | 2022-10-11 | Wells Fargo Bank, N.A. | Computer-based systems for calculating risk of asset transfers |
| US11301467B2 (en) | 2018-06-29 | 2022-04-12 | Security On-Demand, Inc. | Systems and methods for intelligent capture and fast transformations of granulated data summaries in database engines |
| WO2020006567A1 (en) * | 2018-06-29 | 2020-01-02 | Security On-Demand, Inc. | Systems and methods for intelligent capture and fast transformations of granulated data summaries in database engines |
| US10922362B2 (en) * | 2018-07-06 | 2021-02-16 | Clover Health | Models for utilizing siloed data |
| US11620300B2 (en) * | 2018-09-28 | 2023-04-04 | Splunk Inc. | Real-time measurement and system monitoring based on generated dependency graph models of system components |
| US11429627B2 (en) | 2018-09-28 | 2022-08-30 | Splunk Inc. | System monitoring driven by automatically determined operational parameters of dependency graph model with user interface |
| US11947556B1 (en) | 2018-09-28 | 2024-04-02 | Splunk Inc. | Computerized monitoring of a metric through execution of a search query, determining a root cause of the behavior, and providing a notification thereof |
| WO2020118432A1 (en) * | 2018-12-13 | 2020-06-18 | Element Ai Inc. | Data set access for updating machine learning models |
| WO2021024205A1 (en) * | 2019-08-06 | 2021-02-11 | Bosman Philippus Johannes | Method and system of optimizing stock availability and sales opportunity |
| US11720595B2 (en) * | 2020-10-16 | 2023-08-08 | Salesforce, Inc. | Generating a query using training observations |
| US20220121685A1 (en) * | 2020-10-16 | 2022-04-21 | Salesforce.Com, Inc. | Generating a query using training observations |
| US12013826B2 (en) | 2020-11-17 | 2024-06-18 | Coupang Corp. | Systems and methods for database query efficiency improvement |
| US20220277327A1 (en) * | 2021-02-26 | 2022-09-01 | Capital One Services, Llc | Computer-based systems for data distribution allocation utilizing machine learning models and methods of use thereof |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20140280065A1 (en) | | Systems and methods for predictive query implementation and usage in a multi-tenant database system |
| EP3627399B1 (en) | | Systems and methods for real time configurable recommendation using user data |
| JP2021511582A (en) | | Dimensional context propagation technology for optimizing SQL query plans |
| US10311364B2 (en) | | Predictive intelligence for service and support |
| Lehmann et al. | | Technology selection for big data and analytical applications |
| Shyr et al. | | Cognitive Data Analysis for Big Data |
| Liu et al. | | Customer Data Acquisition with Predictive Analytics |
| Nyumbeka | | Using Data Analysis and Information Visualization Techniques to Support the Effective Analysis of Large Financial Data Sets |
| Mohanty et al. | | Big Data Analytics Methodology |
| El Abbass | | Implementing a Bank Sales Analytics Solution and a Predictive model for the Next Best Offer |
| Claudia | | Relevance Of Big Data For Business And Management. Exploratory Insights (Part I) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: SALESFORCE.COM, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: CRONIN, BEAU; PETSCHULAT, CAP; JONAS, ERIC; AND OTHERS; REEL/FRAME: 031309/0189; Effective date: 20130923 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |