
GB2466341A - Method of graphically creating binary expressions - Google Patents


Info

Publication number
GB2466341A
Authority
GB
United Kingdom
Prior art keywords
datasets
graphical
expressions
user
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0919723A
Other versions
GB0919723D0 (en
Inventor
Andreas Wagener
Stefan Abraham
Christian Fritsche
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Publication of GB0919723D0 publication Critical patent/GB0919723D0/en
Publication of GB2466341A publication Critical patent/GB2466341A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2428Query predicate definition using graphical user interfaces, including menus and forms
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26Visual data mining; Browsing structured data
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation
    • G06F17/30398
    • G06F17/30967
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A method of generating logical expressions for use in at least one of filtering and searching one or more datasets using computing hardware operable to provide a graphical user interface is provided. The method includes: (a) representing the one or more datasets in a graphical form on the graphical user interface; (b) facilitating user-manipulation of the graphical form to identify one or more portions of the one or more datasets to define and thereby generate the logical expressions; and (c) applying the expressions to the one or more datasets for performing at least one of filtering and searching operation of the one or more datasets. Providing the graphical user interface assists the user to comprehend complex multidimensional datasets and identify terms to be included in logical expressions.

Description

METHOD OF GRAPHICALLY CREATING BINARY EXPRESSIONS
FIELD OF THE INVENTION
The present invention relates to methods of graphically creating binary expressions.
Moreover, the present invention concerns methods of employing such expressions for providing technical effect when controlling systems. Furthermore, the present invention concerns software products recorded on machine-readable media and executable on computing hardware for implementing such methods. Specifically, the invention provides a method of graphically creating binary expressions using graphical value distribution charts.
BACKGROUND OF THE INVENTION
Complex binary expressions are often difficult for inexperienced users to appreciate and use, for example for conducting various types of searches on a database or for controlling an apparatus or system.
For example, a binary expression such as: NOT (x1 > 2 AND x2 < 2) OR (x3 = 4) is capable of being used to select a subset of data points in an n-dimensional space created by value ranges of a set of variables {x1, ..., xn}.
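As an illustration only, the example expression can be read as a predicate that is evaluated once per data point. The sketch below assumes three-dimensional points held as Python tuples; the function name `matches` and the sample points are hypothetical and do not come from the specification.

```python
from typing import Sequence


def matches(point: Sequence[float]) -> bool:
    """Evaluate NOT (x1 > 2 AND x2 < 2) OR (x3 = 4) for one data point."""
    x1, x2, x3 = point[0], point[1], point[2]
    return (not (x1 > 2 and x2 < 2)) or (x3 == 4)


# Hypothetical data points (x1, x2, x3), invented for this sketch.
points = [(3, 1, 0), (3, 1, 4), (1, 1, 0), (5, 7, 9)]

# The expression selects the subset of points for which it is true.
selected = [p for p in points if matches(p)]
print(selected)
```

Only the first point is rejected: it satisfies the negated conjunction and its third value is not 4.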
However, such complex expressions are often needed in computer programs for filtering datasets to generate a filtered result and for performing operations on the filtered result, for example for controlling a system or making control decisions.
Multidimensional logical expressions are often too complex for inexperienced users to employ correctly, even if support is provided for checking for correct syntax, for example syntax-controlled editors or syntax error markers.
A problem arising in practice is that an alternative approach to appreciating complex logical expressions is required, for example for assisting aforementioned inexperienced users.
Known literature describes transforming a specific simpler representation as a mathematical expression. In US-A-6658404, there is described a single graphical approach for representing and merging Boolean logic and mathematical relationship operators; there is defined a multidimensional graphical representation. Moreover, US-A-5175814 discloses a direct manipulation interface for Boolean information retrieval; natural expressions are transformed.
Furthermore, US-A-7383513 discloses a graphical condition builder for facilitating database queries; flow charts are employed to visualize logical expressions.
A large number of contemporary computer-based tools for processing datasets include a functionality to enter binary expressions that are used to filter the datasets. A common example of such tools is a database tool allowing SQL expressions to be entered which are employed to filter rows of database tables. In SQL, binary expressions are referred to as "conditions" and are specified in a WHERE clause of a SELECT statement, namely (SELECT * FROM <table> WHERE <condition>).
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a method of graphically creating binary expressions, for example expressions which are susceptible to being used for at least one of searching and filtering one or more datasets.
This object is achieved by the features of the independent claims. The other claims and the specification disclose advantageous embodiments of the invention.
According to a first aspect of the invention, there is provided a method of generating logical expressions for use in at least one of filtering and searching one or more datasets using computing hardware operable to provide a graphical user interface, characterized in that the method includes: (a) representing the one or more datasets in a graphical form on the graphical user interface; (b) facilitating user-manipulation of the graphical form to identify one or more portions of the one or more datasets to define and thereby generate the logical expressions; and (c) applying the expressions to the one or more datasets for performing at least one of filtering and searching operation of the one or more datasets.
Advantageously, when implementing the method, the method includes representing the one or more datasets in the graphical form on the graphical user interface as representations of numerical variables and/or categorical variables.
Advantageously, when implementing the method, the graphical form includes one or more graphs, for example bar graphs, whose one or more graphical features, for example bars, are user-selectable for defining the logical expressions. As an alternative to bar graphs, pie graphs, line graphs and similar are optionally employed. More advantageously when implementing the method, the method includes dynamically interactively modifying a number of bars presented to the user in response to one or more parameters input by the user via the graphical interface.
Advantageously, when implementing the method, the method includes forming the logical expressions as a sequence of terms linked by logical AND operators, each term being definable using the numerical variable expressions via the graphical user interface by user-manipulation of the graphical form.
Advantageously, when implementing the method, the method includes forming the expressions via the graphical user interface to include a combination of categorical and numerical variable terms.
Advantageously, when implementing the method, the method includes providing the user via the graphical interface with a choice of logical expressions which are susceptible to being applied for at least one of filtering and searching the one or more datasets.
More advantageously, when implementing the method, the one or more graphical features, for example one or more bars, are representative of one or more frequencies of occurrences of variables within one or more numerical limits applied to the one or more datasets.
More advantageously, when implementing the method, the one or more graphical features, for example one or more bars, are representative of one or more frequencies of occurrences of variables within one or more categorical groups.
According to a second aspect of the invention, there is provided a computer system including computing hardware for generating logical expressions for use in at least one of filtering and searching one or more datasets, the computing hardware being operable to provide a graphical user interface, the system being operable: (a) to represent the one or more datasets in a graphical form on the graphical user interface; (b) to facilitate user-manipulation of the graphical form to identify one or more portions of the one or more datasets to define and thereby generate the logical expressions; and (c) to apply the expressions to the one or more datasets for performing at least one of filtering and searching operation of the one or more datasets.
According to a third aspect of the invention, there is provided software recorded on a machine-readable data carrier, the software being executable on computing hardware for implementing a method pursuant to the first aspect of the invention.
It will be appreciated that features of the invention, as defined by the accompanying claims, are susceptible to being combined in any combination without departing from the scope of the invention as defined by these claims.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention together with the above-mentioned and other objects and advantages may best be understood from the following detailed description of the embodiments, but not restricted to the embodiments, wherein is shown in:
Fig. 1 a table illustrating an example set of data points, namely categorical and numerical data values;
Fig. 2 a schematic view of a bar graph of a type employed when implementing the present invention for counting occurrences of numerical variables;
Fig. 3 an example distribution chart for categorical variables;
Fig. 4 an example distribution chart for numerical variables;
Fig. 5 an example distribution which is graphically manipulated by a user to define an expression [MaritalStatus = "Separated" OR MaritalStatus = "Single"];
Fig. 6 an example distribution which is graphically manipulated by a user to define an expression [(Age >= 20 AND Age < 40) OR Age > 70];
Fig. 7 an example condition regarding categorical variables;
Fig. 8 an example condition regarding categorical variables related to Fig. 7;
Fig. 9 an example condition regarding numerical variables;
Fig. 10 a screen of a graphical user interface (GUI) providing the user with information of available database tables together with SQL conditions;
Fig. 11 a screen providing the user with an opportunity to be graphically presented categorical and numerical variables, and via the graphical representation form SQL expressions or conditions suitable for filtering datasets, for example for filtering and/or searching purposes having technical effect; and
Fig. 12 an implementation of the method pursuant to the present invention implemented on a computer system including a display device on which a user graphical interface is provided for implementing the method, the computer system being optionally coupled to apparatus wherein the computer system is operable to control the apparatus for causing one or more technical effects therein.
In the drawings, like elements are referred to with equal reference numerals. The drawings are merely schematic representations, not intended to portray specific parameters of the invention. Moreover, the drawings are intended to depict only typical embodiments of the invention and therefore should not be considered as limiting the scope of the invention.
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
In overview, the present invention concerns a method of graphically creating binary expressions, the method including: (a) providing for graphical exploration of a range of values; and (b) providing for graphical construction of a desired logical expression.
The expression is susceptible to being subsequently used, for example, for database searching, for executing control operations having technical effect, and so forth.
It is assumed, for purposes of the present invention, that a given set of m data points $P = \{p_1, \ldots, p_m\}$ has n dimensions associated therewith as described by $p_i = \{x_{i1}, \ldots, x_{in}\}$; for example, the n dimensions correspond with n variables associated with each data point $p_i$. Such data points are susceptible to being represented in a matrix comprising m rows and n columns.
When implementing the present invention, it is beneficial to distinguish between categorical variables and numerical variables. Categorical variables have a limited number of values, namely are discrete by nature, whereas numerical variables have a very high or unlimited number of values, for instance a balance in a bank account. Categorical values can be ordered, for example integers, or unordered, for example a plurality of unrelated option states of nominally similar importance, such as different eye colors. In Table 1 shown in Fig. 1, there is provided an example set of m = 4 data points with n = 3 variables, namely dimensions.
The following definitions apply:
$P = \{p_1, p_2, p_3, p_4\}$
$p_i = \{x_{i1}, x_{i2}, x_{i3}\}$
$x_{ij} \in X_j$, with $1 \le i \le m$ and $1 \le j \le n$
$X_1 = \{\text{single}, \text{married}, \text{divorced}, \ldots\}$
$X_2 = R$ (rational numbers)
$X_3 = R$ (rational numbers)
Example values include $p_2 = \{\text{married}, -988.25, 30\}$ and a balance value of $-75$.
For the example, $X_j$ denotes a set of values for a variable j ($1 \le j \le n$), wherein the values are categorical, $X_j = \{C_{j1}, \ldots, C_{jn(j)}\}$, or alternatively numerical, $X_j = R$ (rational numbers). In the example of Table 1, there are four data points denoted by $p_1, p_2, p_3, p_4$ and three variables with value sets $X_1, X_2, X_3$. $X_1$ contains categorical values denoting marital status, $X_2$ contains numerical values denoting balance numbers, and $X_3$ contains numerical values denoting age.
Each data point therefore represents three values, for example, a data point $p_2 = \{\text{married}, -988.25, 30\}$.
As elucidated in the foregoing, the present invention employs graphical techniques to assist users to appreciate and understand multidimensional variables. The method involves analyzing data points by creating frequencies of values by binning and counting techniques; value distribution charts are generated from these frequencies of values. Moreover, the method also involves manipulating the value distribution charts to change the intervals, namely bins, for the values. Furthermore, the method involves suitable selection of the intervals as a basis for creating binary expressions. Optionally, the method concerns automatically generating binary expressions from such selections of values as will be elucidated in more detail later.
The method pursuant to the present invention is clearly distinguished from simple visualization of frequencies in a bar chart which constitutes known technology. The method pursuant to the present invention employs: (a) bar charts to visualize value distributions; and (b) from such bar charts selecting intervals of values.
Such visualization in (a) involves analysis processes.
When analyzing data points P, diagrams are created showing frequencies of values of variables, for example bar charts. In preparing such frequencies of values, a distinction is made between aforesaid numerical and categorical values. For numerical values, the method involves creating intervals and counting a number of occurrences of values of variables within the intervals. Conversely, for categorical values, a number of values in each category is counted. When implementing the method for numerical values, there is a need to determine minimum and maximum values, and one or more intervals therebetween.
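The counting step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: the function names, the half-open interval convention [left, right) and the treatment of `None` as the undefined (UD) category are assumptions made for the sketch, and numerical input is assumed to contain no undefined values.

```python
from collections import Counter


def categorical_frequencies(values):
    """Count occurrences per category; None is counted under 'UD' (undefined)."""
    return Counter("UD" if v is None else v for v in values)


def numerical_frequencies(values, minimum, maximum, inner_intervals):
    """Count values per equal-width interval between minimum and maximum.

    Index 0 of the result is the outer interval below minimum, the last
    index is the outer interval at or above maximum, and the indices in
    between are the inner intervals [left, right).
    """
    width = (maximum - minimum) / inner_intervals
    counts = [0] * (inner_intervals + 2)
    for v in values:
        if v < minimum:
            counts[0] += 1
        elif v >= maximum:
            counts[-1] += 1
        else:
            # Guard against floating-point results landing one bin too far.
            idx = min(int((v - minimum) / width), inner_intervals - 1)
            counts[1 + idx] += 1
    return counts


# Hypothetical sample values, invented for this sketch.
status_counts = categorical_frequencies(["single", "married", None, "single"])
age_counts = numerical_frequencies([18, 25, 33, 47, 52, 71, 90], 20, 80, 3)
print(dict(status_counts), age_counts)
```

Each count corresponds to the height of one bar in the value distribution chart.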
When implementing the method pursuant to the present invention to create binary expressions, the user is required to select one or more bars for a bar chart to be employed when generating a binary expression. For a bar chart displaying intervals of numerical values, the user is beneficially able to: (a) dynamically change the number of intervals of numerical values; and (b) create one or more "snap-in" values, wherein a "snap-in" value is defined as being a value representing a specific interval border. In certain situations, two "snap-in" values are desirable, for example the user is interested in performing a search amongst a group of people having an age in a range of 14 years to 49 years.
Referring to Fig. 2, there is shown a bar graph of a type employed when implementing the present invention. UD is an abbreviation for "undefined". An abscissa axis 10 denotes ranges of numerical values, and an ordinate axis 20 denotes frequency of occurrence of values within the ranges. In a situation where a method pursuant to the present invention calculates default values for the minimum value, the maximum value and the number of intervals n of a numerical variable, then borders for the bar graph are given by:
$b_0 = \text{bottom limit}$
$b_1 = min$
$b_2 = min + width$
$b_3 = min + (2 \cdot width)$
$\ldots$
$b_{n-1} = min + ((n-2) \cdot width) = max$
$b_n = \text{top limit}$
wherein the width is defined as being $width = (max - min)/(n - 2)$.
In other words, the intervals are susceptible to being expressed as: $(b_0, b_1), (b_1, b_2), (b_2, b_3), \ldots, (b_{n-1}, b_n)$.
As elucidated earlier, the value for n is susceptible to being dynamically varied, for example in response to user input and selection, and the number of intervals computed accordingly.
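A short sketch of the border computation follows, assuming n counts all intervals including the two outer ones, as in the formulas above. The function name `interval_borders` is hypothetical, and the open-ended outer borders (bottom limit and top limit) are deliberately left out of the returned list.

```python
def interval_borders(minimum, maximum, n):
    """Return the inner borders b_1 .. b_(n-1) for n intervals in total.

    The two outer borders b_0 (bottom limit) and b_n (top limit) are left
    out because they are open-ended limits rather than computed values.
    """
    if n < 3:
        raise ValueError("n must be at least 3: two outer intervals plus one inner")
    width = (maximum - minimum) / (n - 2)
    # b_1 = min, b_2 = min + width, ..., b_(n-1) = min + (n - 2) * width = max
    return [minimum + k * width for k in range(n - 1)]


print(interval_borders(0, 100, 6))
```

With a minimum of 0, a maximum of 100 and n = 6, the width is 25 and the inner borders run from 0 to 100 in steps of 25.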
When implementing the method, snap-in values are set. Equally-distant intervals are beneficially employed in order to avoid distribution charts appearing visually distorted to users.
EXAMPLE 1: when there is one snap-in value, $s_1$ is the snap-in value. Original minimum and maximum values are denoted by $min_{org}$ and $max_{org}$ respectively. Moreover, $n_{act}$ is used to denote a current number of intervals without outer intervals corresponding to aforementioned $(b_0, b_1)$ and $(b_{n-1}, b_n)$ respectively. A current interval width is denoted by $w_{act}$.
A left distance between the minimum value (min) and the snap-in value is denoted by $d_l = s_1 - min_{org}$. Moreover, a right distance between the maximum value (max) and the snap-in value is denoted by $d_r = max_{org} - s_1$. Pursuant to the method, three values are calculated, namely:
$n_l$ = a number of intervals left of the snap-in value, without the interval $(b_0, b_1)$;
$n_r$ = a number of intervals right of the snap-in value, without the interval $(b_{n-1}, b_n)$; and
$w$ = a new interval width.
The method computes a current number of intervals left and right of the snap-in point, namely $d_l/w_{act}$ and $d_r/w_{act}$ respectively. Next lower and next higher integers to $d_l/w_{act}$ and $d_r/w_{act}$ are utilized for $n_l$ and $n_r$ respectively. There are therefore four combinations of $n_l$ and $n_r$, and the new number of intervals $(n_l + n_r)$ can be one less, the same, or one higher than the current one.
Equating $min_{new} = s_1 - (n_l \cdot w)$ and $max_{new} = s_1 + (n_r \cdot w)$, a minimization condition pertains wherein:
$(min_{new} - min_{org})^2 + (max_{new} - max_{org})^2$
$= (s_1 - (n_l \cdot w) - (s_1 - d_l))^2 + (s_1 + (n_r \cdot w) - (s_1 + d_r))^2$
$= (d_l - (n_l \cdot w))^2 + ((n_r \cdot w) - d_r)^2$
for which a minimum for this function is
$w = ((d_l \cdot n_l) + (d_r \cdot n_r))/(n_l^2 + n_r^2)$
wherein squared differences are beneficially employed so that large deviations are weighted over-proportionally.
When implementing the method pursuant to the present invention, different values of w are computed for the four combinations of $n_l$ and $n_r$.
Finally, the calculated values of w are rounded to a next integer, and then the minimization condition is again applied to determine the best of these values for w.
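The single snap-in computation of Example 1 can be sketched as below. Several points are assumptions rather than statements of the specification: "rounded to a next integer" is read here as rounding to the nearest integer with a floor of 1, ties between equally good candidates are broken by tuple ordering, and the function name `fit_one_snap_in` is hypothetical.

```python
import math


def fit_one_snap_in(min_org, max_org, s1, w_act):
    """Pick (n_l, n_r, w) so that the snap-in value s1 becomes an interval border."""
    d_l = s1 - min_org  # left distance
    d_r = max_org - s1  # right distance

    def cost(n_l, n_r, w):
        # Squared shift of the new minimum and maximum from the originals.
        return (d_l - n_l * w) ** 2 + (n_r * w - d_r) ** 2

    best = None
    # Next lower and next higher integers give up to four combinations.
    for n_l in {math.floor(d_l / w_act), math.ceil(d_l / w_act)}:
        for n_r in {math.floor(d_r / w_act), math.ceil(d_r / w_act)}:
            if n_l + n_r == 0:
                continue  # no inner interval left, so no width can be computed
            w = (d_l * n_l + d_r * n_r) / (n_l ** 2 + n_r ** 2)
            w = max(1, round(w))  # assumption: nearest integer, at least 1
            candidate = (cost(n_l, n_r, w), n_l, n_r, w)
            if best is None or candidate < best:
                best = candidate
    if best is None:
        raise ValueError("no valid combination of interval counts")
    _, n_l, n_r, w = best
    return n_l, n_r, w


print(fit_one_snap_in(0, 100, 30, 20))
```

For an original range of 0 to 100, a snap-in value of 30 and a current width of 20, the candidate widths are 24, 18, 21 and 17; the last one, with two intervals on the left and four on the right, gives the smallest squared shift (new range -4 to 98).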
EXAMPLE 2: when there are two snap-in values $s_1$, $s_2$, such that $s_1 < s_2$, four values are computed as follows:
$n_l$ = number of intervals left of the snap-in value $s_1$, without the interval $(b_0, b_1)$;
$n_0$ = number of intervals between $s_1$ and $s_2$;
$n_r$ = number of intervals right of the snap-in value $s_2$, without the interval $(b_{n-1}, b_n)$; and
$w$ = new interval width.
A computation of a current number of intervals between the two snap-in points $s_1$, $s_2$ is performed, namely $(s_2 - s_1)/w_{act}$, taking lower and higher integer values thereof as candidates for $n_0$.
The width w is then computed from $w = (s_2 - s_1)/n_0$, resulting in $n_l = [(s_1 - min_{org})/w]$ and $n_r = [(max_{org} - s_2)/w]$; square brackets are employed here to denote rounding to a next integer value. From $n_l$ and $n_r$, it is then possible to compute $min_{new} = s_1 - (n_l \cdot w)$ and $max_{new} = s_2 + (n_r \cdot w)$. The aforementioned minimization condition is employed for determining which candidate is best for $n_0$. Referring to Fig. 3, there is shown an example distribution chart for categorical values.
Moreover, referring to Fig. 4, there is shown an example distribution chart for numerical values. Undefined categories are denoted by UD for data points which have a special "null" value.
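Returning to Example 2, the computation for two snap-in values can be sketched as follows. As with the previous example this is an illustration under assumptions: rounding is taken to be to the nearest integer, the name `fit_two_snap_ins` is hypothetical, and the sample call reuses the age range of 14 to 49 years mentioned earlier with an invented original range of 0 to 100 and an invented current width of 10.

```python
import math


def fit_two_snap_ins(min_org, max_org, s1, s2, w_act):
    """Pick (n_l, n_0, n_r, w) so that s1 and s2 both become interval borders."""
    if not s1 < s2:
        raise ValueError("s1 must be smaller than s2")
    ratio = (s2 - s1) / w_act  # current number of intervals between s1 and s2
    best = None
    for n_0 in {math.floor(ratio), math.ceil(ratio)}:
        if n_0 < 1:
            continue  # at least one interval must lie between the snap-in values
        w = (s2 - s1) / n_0
        n_l = round((s1 - min_org) / w)  # assumption: round to nearest integer
        n_r = round((max_org - s2) / w)
        min_new = s1 - n_l * w
        max_new = s2 + n_r * w
        # Same minimization condition as for a single snap-in value.
        cost = (min_new - min_org) ** 2 + (max_new - max_org) ** 2
        candidate = (cost, n_l, n_0, n_r, w)
        if best is None or candidate < best:
            best = candidate
    if best is None:
        raise ValueError("no valid candidate for n_0")
    return best[1:]


print(fit_two_snap_ins(0, 100, 14, 49, 10))
```

Here the candidates for the middle count are 3 and 4; four intervals of width 8.75 move the outer borders to -3.5 and 101.5, which is closer to the original range than the alternative.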
With reference to the method of the invention, the method involves: (a) user-selecting one or more variables for a computer system to display bar charts of all the one or more selected variables; and (b) user-selecting one or more bars in the displayed bar charts as depicted in Fig. 5 and Fig. 6 showing categorical and numerical values respectively; for example: (MaritalStatus = "Separated" OR MaritalStatus = "Single") AND ((Age > 20 AND Age < 40) OR Age >= 70) wherein a logic operator within a chart is OR and between charts is AND.
These steps (a) and (b) enable basic Boolean expressions to be formulated graphically. More general expressions have a form: Expression = Minterm1 OR Minterm2 OR ..., wherein each Minterm is generated by selecting one or more bars in one or more bar charts.
A minterm here is a logical conjunction (AND) of binary variables, wherein the binary variable can be complemented or non-complemented; in other words, a minterm comprises only a logical conjunction operator and a complement operator. In contradistinction, a binary expression which is a disjunction, namely OR, of minterms is referred to as being a disjunctive normal form (DNF), which is a canonical form of expressions.
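To make the terminology concrete, the sketch below evaluates an expression in disjunctive normal form. The data layout (a minterm as a list of variable name and required truth value pairs) and the name `evaluate_dnf` are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, bool]  # (variable name, required truth value)
Minterm = List[Literal]     # literals joined by AND


def evaluate_dnf(minterms: List[Minterm], assignment: Dict[str, bool]) -> bool:
    """Return True if at least one minterm is fully satisfied (OR of ANDs)."""
    return any(
        all(assignment[name] == wanted for name, wanted in minterm)
        for minterm in minterms
    )


# (x1 AND NOT x2) OR (x3): a complemented variable is stored with False.
dnf = [[("x1", True), ("x2", False)], [("x3", True)]]

print(evaluate_dnf(dnf, {"x1": True, "x2": False, "x3": False}))
print(evaluate_dnf(dnf, {"x1": True, "x2": True, "x3": False}))
```

The first assignment satisfies the first minterm; the second satisfies neither.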
A negation of a minterm is susceptible to being selected graphically by the user by selecting appropriate one or more bars in a bar graph. An expression such as (x1 AND x2) OR (x3 AND x4) OR (x5 AND x6) is thus susceptible to being easily selected by selecting bars on a bar chart. Negation corresponding to NOT is beneficially easily invoked graphically by using a cursor to select a checkbox, button or similar graphical control mechanism.
Thus, according to a method pursuant to the present invention, a binary expression can be created by the user "on-the-fly", for example by clicking bar charts. Pursuant to the method, a graphical user interface (GUI) allows the user to define one or more separate conditions, each condition being a minterm that is susceptible to being combined via a logical OR function with other minterms. A logical NOT function is susceptible to being activated by the user explicitly from the GUI.
An example of implementing the method of the invention will now be described. Starting with an empty condition, the user activates one or more bar charts presented on a graphical user interface (GUI), the one or more bar charts representing variables V1, ..., Vn. In general, the user is desirous to employ a condition "(T(V1) AND T(V2) AND ... AND T(Vn))", wherein T(Vi) is a term which is generated by selection of bars in the bar chart for the variable Vi.
In a first step, the user selects or deselects a bar in a bar chart for a categorical variable V. The system removes the associated term T(V) from the condition and its preceding logical AND.
The system then performs the following:
(a) if an "others" (OTH) bar is not selected, see Fig. 6, for each selected bar (except UD "undefined"), a simple expression "OR V = "<value>"" is added to the term T(V);
(b) if the "others" (OTH) bar is selected, see Fig. 8, for each unselected bar (except UD "undefined"), a simple expression "OR V = "<value>"" together with a NOT wrapping is added to the term T(V), namely NOT T(V); and
(c) if the "undefined" (UD) bar is selected, see FIGS. 6 and 7, the system appends "OR V IS NULL" to the term T(V).
The term T(V) is thereby added back to the condition in amended form together with a preceding logical AND.
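The categorical rules (a) to (c) can be sketched as a small string builder. The sketch makes several assumptions that are not in the specification: bars are identified by their labels, the special bars are labelled "OTH" and "UD", values are emitted as single-quoted SQL literals, and selecting OTH together with every named bar is rendered as `V IS NOT NULL`. The names `sql_literal` and `categorical_term` are hypothetical.

```python
def sql_literal(value):
    """Quote a value as an SQL string literal, doubling embedded quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def categorical_term(variable, bars, selected):
    """Build the term T(V) for one categorical bar chart, or None if empty.

    bars is the ordered list of bar labels, selected the set of chosen labels.
    """
    named = [b for b in bars if b not in ("OTH", "UD")]
    parts = []
    if "OTH" in selected:
        # Rule (b): list the unselected bars and wrap them in NOT.
        unselected = [b for b in named if b not in selected]
        if unselected:
            inner = " OR ".join(f"{variable} = {sql_literal(b)}" for b in unselected)
            parts.append(f"NOT ({inner})")
        else:
            parts.append(f"{variable} IS NOT NULL")  # assumption for this edge case
    else:
        # Rule (a): one simple expression per selected bar.
        parts.extend(f"{variable} = {sql_literal(b)}" for b in named if b in selected)
    if "UD" in selected:
        # Rule (c): undefined values are matched with IS NULL.
        parts.append(f"{variable} IS NULL")
    if not parts:
        return None
    return "(" + " OR ".join(parts) + ")"


bars = ["Married", "Separated", "Single", "OTH", "UD"]
print(categorical_term("MaritalStatus", bars, {"Separated", "Single"}))
print(categorical_term("MaritalStatus", bars, {"OTH", "UD"}))
```

Returning `None` for an empty selection mirrors the removal of T(V) from the condition when no bar of the chart is selected.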
In a second step, with reference to Fig. 10, the user selects or deselects a bar in a bar chart for a numerical variable V. The system removes the associated term T(V) from the condition and its preceding logical AND. The system then performs the following for each set of adjacent selected bars:
(a) if a first bar is selected in a set of bars, adding a simple expression "OR V < [right-border-value]" to the term T(V);
(b) if a last bar is selected from the set of bars, adding a simple expression "OR V >= [left-border-value]" to the term T(V);
(c) if (a), (b) and (d) do not pertain, adding a simple expression "OR (V >= [left-border-value] AND V < [right-border-value])" to the term T(V); and
(d) if (a), (b) and (c) do not pertain in an event of an undefined (UD) bar being selected, appending a term "OR V IS NULL" to the term T(V).
The term T(V) is thereby added back to the condition in amended form together with a preceding logical AND.
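A corresponding sketch for numerical bar charts merges each run of adjacent selected bars into one range expression. The indexing scheme is an assumption made for the sketch: `borders` holds the inner borders, bar 0 is the open-ended bar below the first border, and the bar with index `len(borders)` is the open-ended bar at or above the last border. Selecting every bar is rendered as `V IS NOT NULL`, which is likewise an assumption, and the name `numerical_term` is hypothetical.

```python
def numerical_term(variable, borders, selected, undefined_selected=False):
    """Build the term T(V) for one numerical bar chart, or None if empty.

    Bar k (1 <= k < len(borders)) covers [borders[k-1], borders[k]).
    """
    last = len(borders)  # index of the open-ended top bar
    parts = []
    run_start = None
    for k in range(last + 2):  # one extra step to flush a run ending at the top bar
        if k <= last and k in selected:
            if run_start is None:
                run_start = k
            continue
        if run_start is None:
            continue
        run_end = k - 1
        if run_start == 0 and run_end == last:
            parts.append(f"{variable} IS NOT NULL")  # assumption: all bars selected
        elif run_start == 0:
            parts.append(f"{variable} < {borders[run_end]}")
        elif run_end == last:
            parts.append(f"{variable} >= {borders[run_start - 1]}")
        else:
            parts.append(
                f"({variable} >= {borders[run_start - 1]} "
                f"AND {variable} < {borders[run_end]})"
            )
        run_start = None
    if undefined_selected:
        parts.append(f"{variable} IS NULL")
    if not parts:
        return None
    return "(" + " OR ".join(parts) + ")"


borders = [0, 20, 40, 60, 80]
print(numerical_term("Age", borders, {2, 5}))
print(numerical_term("Age", borders, {0, 1}, undefined_selected=True))
```

Merging adjacent bars keeps the generated condition short: two neighbouring bars yield one range rather than two ranges joined by OR.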
The condition, namely expression, amended by the first and second step can then be used for performing a complex search in amongst a multidimensional set of data, for example for extracting information, for implementing a decision, for controlling a system or some other action having technical effect.
By way of example, Fig. 10 and Fig. 11 show examples which realize the user-graphical-interface part of the invention. Such a user graphical interface can be used generally in all database tools that incorporate the functionality to filter rows from database tables. The filter result can be displayed, stored, or processed further in such tools. A concrete example of a tool that needs to filter rows and benefits from the invention is any tool which needs to create conditions to preprocess data while filtering rows, for example as arises in ETL tools and data-mining tools.
Suppose a large department store wants to perform a mailing campaign and wants to use a data mining tool to determine which customers are most suitable for the campaign. A database table with all customers and information about them is available. However, the mailing campaign should not target all customers, because it is a campaign for women's shoes.
Therefore, before the mining functions are used to predict the likelihood for each customer to respond to the mailing campaign, the set of customers has to be reduced according to the campaign's goal. For instance, it is desired to select all women who spent at least a certain amount of money on shoes in the last three months. This is the customer group that is fed into the mining algorithm, which orders them according to their likelihood to respond to the campaign. To determine the exact values for the condition, for example which spending limit should be used and whether the last three months or another period should be considered, the distributions of the variables are examined. The invention allows these values to be specified and, at the same time, the SQL expression that is used to filter the database table to be generated.
Today's mining tools allow the parameters for the mining process itself, for example the source data table and the mining algorithm selection, to be specified interactively via point-and-click, but for data filtering the user still has to write the corresponding expression. With the invention, this can now also be done in a point-and-click way. Of course, complex expressions that, for instance, involve special mathematical functions cannot be specified that way.
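The filtering step of the campaign example can be sketched with Python's built-in `sqlite3` module. Everything specific here is invented for illustration: the table and column names, the sample customers and the spending limit of 100. The two terms stand for selections made in two bar charts and are joined by AND, as described for terms from different charts.

```python
import sqlite3

# Hypothetical customer table held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, gender TEXT, shoe_spend REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ana", "F", 150.0),
        ("Ben", "M", 300.0),
        ("Cleo", "F", 40.0),
        ("Dana", None, 500.0),
        ("Eve", "F", 100.0),
    ],
)

# One term per bar chart; terms from different charts are joined by AND.
terms = ["(gender = 'F')", "(shoe_spend >= 100)"]
condition = " AND ".join(terms)

# The condition is generated by the tool, not typed in by the user.
query = f"SELECT name FROM customers WHERE {condition} ORDER BY name"
rows = conn.execute(query).fetchall()
print(condition)
print(rows)
conn.close()
```

The customer with an undefined gender is excluded because no `IS NULL` term was generated, which corresponds to leaving the UD bar unselected.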
In order to further describe embodiments of the present invention, its implementation via the aforesaid graphical user interface (GUI) will now be elucidated. SQL filtering expressions are susceptible to being generated for one or more database tables. In Fig. 9, there is shown a screen of a graphical user interface (GUI) providing the user with information of available database tables together with SQL conditions. The user is desirous of using the screen to add or edit a condition. On a left-hand-side of the screen, the user can select a table of interest for which the user is desirous to add or edit a condition.
On a right-hand-side of the screen, there are shown upper and lower areas. In the top area, the user is able to select, using a cursor or similar, symbols invoking selections corresponding to "meeting at least one of these conditions" or "none of these conditions"; "none of these conditions" corresponds to the aforementioned logical NOT operation. The bottom area shows all conditions that have been defined for the selected table. A "name" column provides information regarding a name which is allocated to the condition. Likewise, a "condition" column displays corresponding SQL conditions.
The user is capable of adding, editing and deleting conditions via toolbar icons provided on the screen as illustrated in Fig. 10. In the screen of Fig. 10, there is presented a dialogue defined in three areas. A first upper area presents rows with column names of a table together with statistical information. The user is able to select one or more rows from the first area. For each selected row in the first area, a data distribution is presented in a second lower area of the screen. The second area consists of a toolbar, display setting and bar charts. In Fig. 10, there are shown two example bar charts in the second area, namely gender and age in this example, namely categorical and numerical variables presented in a graphical form. The user is able to select one or more bars in the bar charts. When selected, the selected bars beneficially change color, for example to a white color.
The user is able to interact with each bar directly for controlling search conditions, namely expressions. By selecting one or more bars, modified conditions, namely expressions, are added or modified and displayed at a bottom region of the screen.
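How selected bars might translate into a condition can be sketched as follows. This is a minimal illustration under stated assumptions rather than the generation logic of the embodiment: selected categories are assumed to become an IN list, selected numerical bins are assumed to become half-open ranges joined by OR, and the terms for different columns are linked by AND as in claim 5. All function names are hypothetical, and column names are assumed to be trusted identifiers.

```python
def categorical_term(column, selected_categories):
    """Build a term from the selected bars of a categorical bar chart."""
    quoted = ", ".join(
        "'" + value.replace("'", "''") + "'" for value in selected_categories
    )
    return f"{column} IN ({quoted})"


def numerical_term(column, selected_bins):
    """Build a term from selected bins, each given as a (low, high) pair."""
    ranges = [f"({column} >= {low} AND {column} < {high})" for low, high in selected_bins]
    if len(ranges) == 1:
        return ranges[0]
    return "(" + " OR ".join(ranges) + ")"


def build_condition(terms):
    """Link the per-column terms with logical AND operators."""
    return " AND ".join(terms)


# Hypothetical selection: the "F" bar and the two age bars covering 20 to 40.
condition = build_condition([
    categorical_term("GENDER", ["F"]),
    numerical_term("AGE", [(20, 30), (30, 40)]),
])
```

With this selection, `condition` holds a text that could be displayed at the bottom region of the screen and placed after WHERE in a SELECT statement.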
In SQL, binary expressions are, as aforementioned, called conditions and are specified in a WHERE clause of a SELECT statement, for example SELECT * FROM <table> WHERE <condition>. Contemporary known SQL tools provide differing levels of support when creating such search and filtering conditions: (i) Case 1: a complete condition has to be entered manually as text without user assistance, such that no syntactical checks are done; (ii) Case 2: a complete condition has to be entered manually as text, but a system syntax check is performed and syntax errors and warnings are provided to the user; (iii) Case 3: a condition has to be entered manually, but there is some help to enter keywords of the expression language, variable names, mathematical operators or function names, and syntax checks are additionally performed;
(iv) Case 4: a development from case 3, with additional support to enter constants inside expressions, wherein functions of scanning datasets and providing candidate values for a constant are accommodated. An example of case 4 is a proprietary DB2 SQL Assist product wherein the user thereof is forced to enter terms of an expression in a proper sequence; (v) Case 5: a condition is entered as a sequence of terms that is controlled by a syntax-control editor implemented in software. Such control prevents syntax errors in search expressions on account of a requirement that a sequence of terms is always a valid sequence; and (vi) Case 6: a condition is displayed as a graphical tree, wherein tree nodes denote logical operators, therefore preventing syntax errors from being made. An example of case 6 is the proprietary contemporary DB Visualizer Query Builder.
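The kind of syntax check described for case 2 can be illustrated with a short Python sketch using the standard sqlite3 module. It is not taken from any of the named tools: the table `customers`, its columns and the function `check_condition` are hypothetical, and SQLite stands in for whichever database a given tool targets. The sketch relies on SQLite compiling a statement before running it, so that a malformed condition raises an error.

```python
import sqlite3


def check_condition(conn, table, condition):
    """Return None if the condition compiles, otherwise the error message.

    EXPLAIN makes SQLite compile the statement without scanning the table.
    The table name is assumed to be a trusted identifier.
    """
    try:
        conn.execute(f"EXPLAIN SELECT * FROM {table} WHERE {condition}")
    except sqlite3.Error as exc:
        return str(exc)
    return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (gender TEXT, age INTEGER)")

ok = check_condition(conn, "customers", "age >= 20 AND age < 30")
bad = check_condition(conn, "customers", "age >= AND")
```

Such a check reports whether a manually entered text is valid, but, as the next paragraph notes, it does not help the user decide what the condition should be.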
None of cases 1 to 6 assists the user to define a condition or expression for performing searching and/or filtering of one or more datasets; they merely check, in some situations, whether or not the associated expression or condition syntax is correct. In contradistinction, the present invention concerns a graphical method to assist the user to define the conditions for searching data, for example for subsequently taking a decision having technical effect.
The present invention assists the user to address three problems: (a) to understand data for describing datasets to be filtered in natural language; (b) to appreciate logical syntax and semantics of the expression language of software, for example SQL expressions; and (c) to map a natural language description to the expression language of the software.
The problems are addressed by employing a graphical representation of complex multidimensional datasets, and providing the user with a tool for manipulating search and/or filter conditions, namely expressions, in respect of the graphical representation. The search conditions for filtering, or otherwise manipulating, the datasets are susceptible to being used to guide further actions, for example further actions having technical effect in technically controlling systems and apparatus as will be elucidated in greater detail later.
The present invention is susceptible to being implemented in a form of software recorded on a machine-readable data carrier, wherein the software is executable on computing hardware for implementing methods pursuant to the present invention; the methods concern providing a graphical representation of one or more datasets to users, the users being able via the methods to specify search and/or filtering conditions in respect of the graphical representation. Whereas earlier filter and search methods have provided syntactical checking of search and filter conditions, these earlier methods have not provided an easily useable graphical interface for users as aforementioned in respect of the present invention.
The invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc. Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read-only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus.
The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including, but not limited to, keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
Referring now to Fig. 12, the aforementioned method pursuant to the present invention is susceptible to being employed in a data processing system including a computer 50 provided with: (a) a data processor 60; (b) a first electronic memory 70 for storing software recorded on a machine-readable data carrier, the software being executable on the data processor 60 for implementing one or more methods pursuant to the present invention; (c) a second electronic memory 80 for storing one or more datasets to be at least one of searched and filtered pursuant to the one or more methods; and (d) a display 100 including a screen 110 and data entry hardware 120, the screen 110 being operable to present a user graphical interface for enabling a user 130 to interact with the software executing upon the data processor 60.
Optionally, the first and second data memories 70, 80 are implemented as two portions of a same general data memory.
Optionally the computer 50 is employed to control or direct operation of a technical apparatus or technical system denoted by 150. Optionally, the apparatus or system 150 includes sensors operable to generate data 160 for recording into the second data memory 80 for at least one of subsequently searching or filtering pursuant to one or more methods of the present invention as described in the foregoing; conditions, namely expressions, derived by using the user graphical interface are employed by the data processor 60 to process the one or more datasets recorded in the second memory 80. Based upon at least one of searching and filtering the one or more datasets recorded in the second memory 80, the computer 50 is operable to issue one or more commands denoted by 170 for controlling the apparatus or system 150.
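The flow from recorded sensor data to issued commands can be sketched as follows. This is a hypothetical illustration only: an in-memory SQLite table stands in for the second memory 80, the condition text stands in for an expression derived via the graphical user interface, and the table `readings`, its columns and the "THROTTLE" command strings are invented for the example.

```python
import sqlite3


def commands_for(conn, condition):
    """Filter recorded readings with the condition and derive one command per match."""
    rows = conn.execute(
        f"SELECT sensor_id FROM readings WHERE {condition} ORDER BY sensor_id"
    ).fetchall()
    return [f"THROTTLE {sensor_id}" for (sensor_id,) in rows]


# Stand-in for data 160 recorded by the sensors of the apparatus or system 150.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id INTEGER, temperature REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 85.0), (2, 97.5), (3, 91.0)],
)

# Stand-in for commands 170, using a condition assumed to come from the GUI.
commands = commands_for(conn, "temperature >= 90")
```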
The apparatus or system 150 is beneficially a financial stock exchange, a petroleum platform, a solar renewable energy system, a traffic control system, a financial services website or similar, wherein considerable quantities of data need to be comprehended and considered rapidly and accurately by suitable filtering and/or searching operations for controlling and directing the apparatus or system 150. The present invention is applicable to processing such considerable quantities of data.
Modifications to embodiments of the invention described in the foregoing are possible without departing from the scope of the invention as defined by the accompanying claims.
Expressions such as "including", "comprising", "incorporating", "consisting of", "have", "is" used to describe and claim the present invention are intended to be construed in a non-exclusive manner, namely allowing for items, components or elements not explicitly described also to be present. Reference to the singular is also to be construed to relate to the plural.
Numerals included within parentheses in the accompanying claims are intended to assist understanding of the claims and should not be construed in any way to limit subject matter claimed by these claims.

Claims (14)

  1. A method of generating logical expressions for use in at least one of filtering and searching one or more datasets (80) using computing hardware (60) operable to provide a graphical user interface (110), wherein said method includes: (a) representing the one or more datasets in a graphical form on the graphical user interface (110); (b) facilitating user-manipulation of the graphical form to identify one or more portions of the one or more datasets (80) to define and thereby generate the logical expressions; and (c) applying the expressions to the one or more datasets (80) for performing at least one of filtering and searching operation of the one or more datasets (80).
  2. The method according to claim 1, including representing the one or more datasets in the graphical form on the graphical user interface (110) as representations of numerical variables and/or categorical variables.
  3. The method according to claim 1 or 2, wherein the graphical form includes one or more graphs whose one or more graphical features are user-selectable for defining the logical expressions.
  4. The method according to claim 3, including dynamically interactively modifying a number of graphical features presented to the user in response to one or more parameters input by the user via the graphical interface.
  5. The method according to claim 1, including expressing the logical expressions as a sequence of terms linked by logical AND operators, each term being definable using the numerical and/or categorical variable expressions via the graphical user interface by user-manipulation of the graphical form.
  6. The method according to claim 1, wherein the method includes forming the expressions via the graphical user interface to include a combination of categorical and numerical variable terms.
  7. The method according to claim 1, including providing the user via the graphical interface with a choice of logical expressions which are susceptible to being applied for at least one of filtering and searching the one or more datasets.
  8. The method according to claim 3, wherein the one or more graphical features are representative of one or more frequencies of occurrences of variables within one or more numerical limits applied to the one or more datasets.
  9. The method according to claim 3, wherein the one or more graphical features are representative of one or more frequencies of occurrences of variables within one or more categorical groups.
  10. A data processing system (50, 100) including computing hardware (60) for generating logical expressions for use in at least one of filtering and searching one or more datasets (80), the computing hardware (60) being operable to provide a graphical user interface (110), the system being operable: (a) to represent the one or more datasets (80) in a graphical form on the graphical user interface (110); (b) to facilitate user-manipulation of the graphical form to identify one or more portions of the one or more datasets (80) to define and thereby generate the logical expressions; and (c) to apply the expressions to the one or more datasets (80) for performing at least one of filtering and searching operation of the one or more datasets (80).
  11. A data processing program (70) for execution in a data processing system (50, 100) comprising software code portions for performing a method when said program is run on a computer, wherein the method steps comprise (a) representing the one or more datasets (80) in a graphical form on the graphical user interface (110); (b) facilitating user-manipulation of the graphical form to identify one or more portions of the one or more datasets (80) to define and thereby generate the logical expressions; and (c) applying the expressions to the one or more datasets (80) for performing at least one of filtering and searching operation of the one or more datasets (80).
  12. A data processing program (70) recorded on a machine-readable data carrier, the program (70) being executable on computing hardware (60) for implementing a method as claimed in any one of claims 1 to 9.
  13. A computer program product comprising a computer useable medium including a computer readable program, wherein the computer readable program when executed on a computer causes the computer to (a) represent the one or more datasets (80) in a graphical form on the graphical user interface (110); (b) facilitate user-manipulation of the graphical form to identify one or more portions of the one or more datasets (80) to define and thereby generate the logical expressions; and (c) apply the expressions to the one or more datasets (80) for performing at least one of filtering and searching operation of the one or more datasets (80).
  14. A computer program product comprising a computer useable medium including a computer readable program, wherein the computer readable program when executed on a computer causes the computer to perform the method according to any one of the claims 1-9.
GB0919723A 2008-12-17 2009-11-11 Method of graphically creating binary expressions Withdrawn GB2466341A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP08171901 2008-12-17

Publications (2)

Publication Number Publication Date
GB0919723D0 GB0919723D0 (en) 2009-12-30
GB2466341A true GB2466341A (en) 2010-06-23

Family

ID=41509183

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0919723A Withdrawn GB2466341A (en) 2008-12-17 2009-11-11 Method of graphically creating binary expressions

Country Status (1)

Country Link
GB (1) GB2466341A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5265246A (en) * 1990-12-10 1993-11-23 International Business Machines Corporation Graphic definition of range in the selection of data from a database field
EP0627691A1 (en) * 1993-06-04 1994-12-07 International Business Machines Corporation Method and system for searching a database utilizing a graphical user interface
US5734888A (en) * 1993-06-04 1998-03-31 International Business Machines Corporation Apparatus and method of modifying a database query
US5894311A (en) * 1995-08-08 1999-04-13 Jerry Jackson Associates Ltd. Computer-based visual data evaluation
US5966126A (en) * 1996-12-23 1999-10-12 Szabo; Andrew J. Graphic user interface for database system
EP1126384A2 (en) * 1992-11-06 2001-08-22 Ncr International Inc. Data analysis apparatus and methods
US20050125399A1 (en) * 2003-10-28 2005-06-09 Scott Ireland Method, system, and computer program product for constructing a query with a graphical user interface

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8874607B2 (en) 2010-08-17 2014-10-28 Fujitsu Limited Representing sensor data as binary decision diagrams
US8495038B2 (en) 2010-08-17 2013-07-23 Fujitsu Limited Validating sensor data represented by characteristic functions
US8572146B2 (en) 2010-08-17 2013-10-29 Fujitsu Limited Comparing data samples represented by characteristic functions
US8583718B2 (en) 2010-08-17 2013-11-12 Fujitsu Limited Comparing boolean functions representing sensor data
GB2482966A (en) * 2010-08-17 2012-02-22 Fujitsu Ltd Representing sensor samples as a characteristic function of minterms
US8645108B2 (en) 2010-08-17 2014-02-04 Fujitsu Limited Annotating binary decision diagrams representing sensor data
US9138143B2 (en) 2010-08-17 2015-09-22 Fujitsu Limited Annotating medical data represented by characteristic functions
US9002781B2 (en) 2010-08-17 2015-04-07 Fujitsu Limited Annotating environmental data represented by characteristic functions
US8930394B2 (en) 2010-08-17 2015-01-06 Fujitsu Limited Querying sensor data stored as binary decision diagrams
US8620854B2 (en) 2011-09-23 2013-12-31 Fujitsu Limited Annotating medical binary decision diagrams with health state information
US8838523B2 (en) 2011-09-23 2014-09-16 Fujitsu Limited Compression threshold analysis of binary decision diagrams
US8909592B2 (en) 2011-09-23 2014-12-09 Fujitsu Limited Combining medical binary decision diagrams to determine data correlations
US8812943B2 (en) 2011-09-23 2014-08-19 Fujitsu Limited Detecting data corruption in medical binary decision diagrams using hashing techniques
US8781995B2 (en) 2011-09-23 2014-07-15 Fujitsu Limited Range queries in binary decision diagrams
US9075908B2 (en) 2011-09-23 2015-07-07 Fujitsu Limited Partitioning medical binary decision diagrams for size optimization
US8719214B2 (en) 2011-09-23 2014-05-06 Fujitsu Limited Combining medical binary decision diagrams for analysis optimization
US9177247B2 (en) 2011-09-23 2015-11-03 Fujitsu Limited Partitioning medical binary decision diagrams for analysis optimization
US9176819B2 (en) 2011-09-23 2015-11-03 Fujitsu Limited Detecting sensor malfunctions using compression analysis of binary decision diagrams

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)