[go: up one dir, main page]

WO1989003567A1 - Base de donnees relationnelles utilisant des identificateurs - Google Patents

Base de donnees relationnelles utilisant des identificateurs Download PDF

Info

Publication number
WO1989003567A1
WO1989003567A1 PCT/US1988/003477 US8803477W WO8903567A1 WO 1989003567 A1 WO1989003567 A1 WO 1989003567A1 US 8803477 W US8803477 W US 8803477W WO 8903567 A1 WO8903567 A1 WO 8903567A1
Authority
WO
WIPO (PCT)
Prior art keywords
row
relation
values
block
column
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US1988/003477
Other languages
English (en)
Inventor
Douglas Wyche Caldwell
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nucleus International Corp
Original Assignee
Nucleus International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nucleus International Corp filed Critical Nucleus International Corp
Publication of WO1989003567A1 publication Critical patent/WO1989003567A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases

Definitions

  • the present application relates generally to a computerized database system and more particularly to a method and apparatus for utilizing identifiers in a relational database for preserving referential integrity.
  • null values The reason for permitting null values is that some relationships may exist, but the relevant data is unavailable or unknown.
  • a primary key serves to uniquely identify a particular "relational entity” within a set of such entities.
  • a “relational entity” is defined as all or part of a row of a table (i.e., relation).
  • the two referential integrity rules may be stated as:
  • No relational entity may include either: a. a potentially ambiguous reference to another relational entity, or b. a reference to a non-existent or unknown relational entity.
  • a database containing bank account information in one table and transaction information in another table may not be able to correlate all transactions (in which the account numbers of the debtor and of the creditor are foreign keys) with the corresponding information for individual accounts (in which the account number is a primary key which uniquely identifies the account) if an individual account number is directed to the wrong account.
  • the database system will not be able to generate accurate account statements for all the individuals who participated in the various transactions and will most likely have some transactions for which either the necessary debit or credit account information is unavailable.
  • Referential integrity can be maintained in a conventional relational database if, every time data is entered into the database, there is a check performed by the system to determine that:
  • the new- key value duplicates an existing key value; and (3) whenever a value is updated or deleted which includes a primary key, the old key value corresponds to that of any existing value of any associated foreign key and, if so, updating all foreign keys corresponding to the changed key or setting to "null" (no value) all foreign keys corresponding to the deleted key.
  • one aspect of the preferred embodiment of "the present invention includes an apparatus and method, utilizing a computer for representing a relational database.
  • the relational database contains a plurality of relations, each containing one or more columns and rows.
  • a column has one or more values which all have a common characteristic.
  • Each value in the column corresponds to one of the rows of the relation.
  • Each row contains one or more values, a d each value is from a different column.
  • the present invention overcomes the shortcomings in the prior art by replacing each occurrence of the same value (or a related set of values) in a column with an "identifier" or “token” (a symbol that represents and is unique to a particular value by reference to the value in an underlying value set) .
  • the values for all columns which share a common characteristic are contained in one value set.
  • Each identifier may then be set equal to the ordinal position of each value in the value set.
  • the individual rows of a relation may also be individually identified by an identifier, so that one or more related values may be represented with a single "identifier" which references corresponding values of another row in the same or a different relation.
  • the present invention greatly simplifies the database design, data entry, and data query processes.
  • constructing a database in accordance with the present invention there is no need for any a priori definition of whic columns are to serve as primary keys; moreover, since an identifier provides an explicit reference to the intended relational entity, there is no need to be concerned whether a change to any existing values associated with a particular entity will cause a reference to become ambiguous, even if the change results in a primary key to cease to uniquely identify that particular relational entity.
  • identifiers in place of values and groups of related values in rows, it is possible to implement a relational database which maintains referential integrity.
  • the system can automatically check whether a newly added entity is an exact duplicate of an existing entity, without requiring any further checking of any added or changed key data or any comparisons with other associated key data in the same or a different table in the database.
  • FIG. 1 depicts three related relations or tables of a relational database as perceived by the user;
  • FIG. 2 depicts the values in six domains of the database;
  • FIG. 3 depicts a way the data depicted in FIG. 1 may be stored after processing the data according to the aspect of the present invention
  • FIG. 4 depicts certain changes to the tables of FIG. 1 as perceived by the user
  • FIG. 5 depicts the effects of the changes to FIG. 4 on the identifiers of FIG. 3;
  • FIG. 6 (comprising FIGS. 6(a), 6(b) and 6(c), corresponding respectively to FIGS. 3(a), 3(b) and 3(c)) depicts an improved representation of the data of FIG. 3, whereby referential integrity of the data is preserved;
  • FIG. 7 depicts a computer system equipped with a Relational Database Management System (RDMS) for creating a relational database with identifiers in accordance with the present invention
  • RDMS Relational Database Management System
  • FIG. 8 is a flow diagram of the BUILD DATABASE routine
  • FIG. 9 is a flow diagram of the CREATE DATABASE routine
  • FIG. 10 is a flow diagram of the LOAD DATABASE routine.
  • FIG. 11 (comprising FIGS. 11(a), 11(b), 11(c) and 11(d)) is a results table depicting the construction of the Persons relation with identifiers as shown in FIG. 6(a).
  • FIG. 12 (comprising FIGS. 12(a) and 12(b)) is a results table depicting the construction of the Employee relation with identifiers as shown in FIG. 6(b).
  • FIG. 13 is a results table depicting the construction of the Manager relation with identifiers as shown in FIG. 6(c).
  • FIGS. 1(a), 1(b) and 1(c) it may be seen that a typical relational database comprises several relations, each of which being representable as a table 12, 14, 16.
  • PERSON table 12 it may be seen that the table contains a number of characteristics (such as names and dates) .
  • FIG. 1 shows the data as it might be logically stored in a prior art relational database system. More specifically, each row 18(a), 18(b), . . .
  • each column 20(a), 20(b), . . . 20(i) corresponds to a particular characteristic.
  • FIGS. 1(a), 1(b) and 1(c) a description of the relational database containing the PERSON table, EMPLOYEE table and MANAGER table, is discussed.
  • the PERSON table at row 18(b) the entry for Mary Smith having birth date 3/14/39 is indicated at 9.
  • Mary Smith is also shown in the MOTHER columns of the PERSON table at row 18(i), as indicated at 11.
  • the dotted line 19 depicts the relationship of Mary Smith at row 18(b) with Mary Smith at row 18(i).
  • Mary Smith is married to Sam Jones, born 9/9/30 and is the mother to John Jones, born 3/5/62.
  • a different Mary Smith is listed.
  • Mary Smith at row 18(k) has a birth date of 5/8/35 which distinguishes this Mary Smith from the Mary Smith at row 18(b) .
  • Mary Smith at row 18(k) in the PERSON table is also the same Mary Smith in the EMPLOYEE table (FIG. 1(b)) at row 62(c).
  • Mary Smith is an employee of a particular company in the security department and makes $43,000.00 per year.
  • the relational database is set up so that each table, FIGS. 1(a), 1(b) and 1(c), are interdependent.
  • the EMPLOYEE table, FIG. 1(b) is directly dependent upon the PERSON table, FIG. 1(a), for its information.
  • the MANAGER table is directly dependent upon the EMPLOYEE table of FIG. 1(b).
  • Mary Smith of row 41(a) of the MANAGER table can only relate back to Mary Smith at row 42(c) of the EMPLOYEE table.
  • Mary Smith of the EMPLOYEE table relates back to Mary Smith in the PERSON table.
  • Other tables can be formed, such as the softball team or volleyball team, which may be dependent upon the PERSON table for its entries.
  • the relational database administrator sets up the database so that information from one table can be unambiguously referenced in another table.
  • PERSON-LAST 20(a), FATHER-LAST 20(f) and MOTHER-LAST 20(g) are all surnames.
  • the set of all possible values that a column or a number of related columns can contain is called a domain. It is often advantageous to consider all last names as coming from a single, separately defined domain.
  • the values which occur in the relational database are a subset of the domain of values, and the subset is referred to as a value set.
  • At least those values which share such a common characteristic and which are already in existence are arbitrarily ordered (e.g. , in the order they were originally entered into the system) to form an ordered value set.
  • the ordered set of Last Names 28 (FIG. 2) (which is associated with columns 20(a), 20(d), 20(g), 38(a) and 40(a) of FIG. 1(a)) is depicted as ⁇ being associated with respective unique identifiers 30 (FIG. 2) .
  • a non-null identifier to each unique (possibly null) value in a particular domain may be used by a relation having a column defined on that domain. Note that Jones 26 of the Last Name domain 28 has the identifier "1" associated with it.
  • each value in the Last Name column in the database of FIG. 1 is replaced by a unique identifier in FIG. 3.
  • unique identifier For example, for each "JONES" 26 there is associated the unique identifier "1", as indicated at 32 (FIG. 3(a)).
  • a similar procedure has been used to form the First Name value set 34 and its associated set of identifiers 30 illustrated in FIG. 2 , which replaces the value in the First Name column in the database of FIG. 1 by the unique identifiers as shown in FIG. 3.
  • a more detailed discussion on the process for building the relational database with identifiers is given below in section G.
  • the value sets 28 and 34 are respectively the sets of all values of last and first names actually used by all the related columns (e.g., columns 20(a), 20(d), 20(g), 38(a) and 40(a) of FIGS. 1 (a), 1(b) and 1(c) are associated with the ordered set of Last Names 28) , they must constantly change to reflect the current contents of those columns.
  • One straightforward implementation is to keep on adding values without deleting values no longer in use.
  • the value set could be allowed to shrink as well as grow, in which case it is advantageous to introduce a second type of identifier (not shown in FIGS.) to track the ordering of identifiers, since it would be very time consuming to renumber all the identifiers in the entire system whenever a particular value ceases to be in use.
  • variable length alphanumeric entries may now be represented by fixed length integers (the only requirement is that the system be able to accommodate integer values as large as the maximum number of unique values in any one value set) .
  • the present invention readily solves the foregoing problem of referential integrity by extending the use of identifiers so that each relational entity of a relation may reference another relational entity (or row) of the same or a different relation.
  • FIG. 6, comprising FIGS. 6(a) through 6(c), corresponds to FIGS. 1, 3, .4 and 5, but is modified to include row (relational entity) identifiers, wherein three related columns (e.g., Father's First and Last Names and birth date) have been replaced with a single column "FATHER" 68 whose entries reference the relational entities of the PERSON relation 12"' containing the Father column data.
  • the ordinal row numbers serve as the identifiers.
  • FIGS. 1(a), 1(b), 1(c) and 6(a), 6(b) and 6(c) a second version of the relational database using identifiers is shown.
  • the reference Mary Smith having a birth of 3/14/39 is shown linked to Mary Smith having a birth of 3/14/39 by dotted line 19.
  • Mary Smith is John- Jones' mother at row 18(i) of the PERSON table.
  • the PERSON table, as represented in FIG. 6(a) is also shown as linking Mary Smith (02 02 02) to Mary Smith (02) in row 18(i) of the PERSON relation 12"' by the dotted line 19.
  • a standard programming language such as SQL, may be used to define the structure of a relational database.
  • SQL a standard programming language
  • a JOIN is a SELECT over the Cartesian product of at least two relations of the relational database.
  • the JOIN operation is used to combine relations based on equivalent values in columns having the same characteristic.
  • JOIN involves combining each of N rows of the first relation (e.g., the PERSON relation, FIG. 1(a)), with each of M rows of a second relation (e.g., the EMPLOYEE relation (FIG. 1(b)), to form an N x M resultant relation.
  • the JOIN operation describes all resultant rows of a JOIN relation which do not have a match in the JOIN column.
  • This type of JOIN relation is generally referred to as the "EQUIJOIN" operation.
  • the JOIN operation can be performed more efficiently as a SELECT operation.
  • a JOIN of the PERSON table and EMPLOYEE table where PERSON-NAME is equal to EMPLOYEE-NAME is performed by selecting all identifiers in the EMPLOYEE table in the Employee Name column which correspond to row identifiers in the PERSON table.
  • identifiers 9, 10, 11 and 12 correspond to the rows having identifiers 9, 10, 11 and 12 of the PERSON table.
  • the SELECT operation over the EMPLOYEE table of FIG. 6(b) will automatically join the equivalent entries of the PERSON table of FIG. 6(a). This is a direct result of separating the functions of binding and information representation by the use of identifiers. Because the binding is already explicit, less work needs to be performed.
  • FIG. 7 a computer system having a programmable computer and computer programs for creating a relational database with identifiers is shown.
  • the system includes program computer 103, display 101, keyboard 107 for the computer and external device 109 for storage of data.
  • Hardware and software for creating a relation having identifiers is stored in the Relational Database Management System (RDMS) 105 (shown in phantom lines) , which is connected within the computer 103.
  • the RDMS 105 coordinates the various activities related to adding and implementing identifiers to a relational database.
  • RDMS Relational Database Management System
  • External device 109 is a permanent or buffered storage, typically in the form of a hard disk, for storing relations which are stored in expanded form where each value is no smaller than a byte.
  • the contents of the external device are typically maintained in records which are divided into fields where the Nth field of each record corresponds to a specific type of data.
  • the contents of the storage device 109 are loaded to a relational processor unit (RPU) in the RDMS 105.
  • RPU relational processor unit
  • the operation of the bus system and the operation of the components of the RDMS 105 are controlled by routines implemented by the RPU. When the routines are loaded into the RPU, coordination and processing of the various components of the RDMS 105 is established.
  • the BUILD process for building a relational database having identifiers is shown. More particularly, the BUILD DATABASE routine (151, FIG. 8) coordinates the CREATE phase (FIG. 9) and the LOAD phase (FIG. 10) to be discussed. During block 153 of the BUILD DATABASE routine (FIG. 8) , the CREATE DATABASE routine (FIG. 9) is called for building a "framework" for the relational database. During this routine, the framework for maintaining the relational database having identifiers is created. During block 154, the first relation to be inserted into the relational framework is selected.
  • a row of the relation from the load file stored in external device 109 is brought over to the RPU of the RDMS 105.
  • the INSERT ROW routine (FIG. 10) is called to " coordinate the process of assigning, identifiers to the incoming row.
  • the RPU determines whether there are any more rows of the selected relation left in the load file. Assuming that there are more rows remaining in the relation in the load file, processing continues at blocks 155, 152 and 161 until all of the rows of the particular relation are inserted. Assuming that all of the rows of the relation in the load file have been processed and inserted into the database, block 165 is processed.
  • FIG. 9 is a flow diagram of the CREATE DATABASE routine for building a "framework" for the individual domains of ordered value sets and the frameworks for the relations specified by the user to be included in the relational database.
  • the relational database as depicted in FIG. 9 is a flow diagram of the CREATE DATABASE routine for building a "framework" for the individual domains of ordered value sets and the frameworks for the relations specified by the user to be included in the relational database.
  • FIG. 1 and the domains as shown in FIG. 2 are an example of a relational database whose framework may be created by the CREATE DATABASE routine (FIG. 9) .
  • a domain framework is constructed.
  • the framework includes user supplied specifications concerning the characteristics of the domain such as the name, data type, value, size and/or other user specified restrictions on the values it may contain.
  • the RPU determines whether there are other domains which need frameworks to be built. Assuming that there are other domain frameworks to be constructed, processing continues at blocks 102 and 104 until all of the domain frameworks have been built. When all of the domain frameworks have been built, processing continues at block 106.
  • the framework for a relation is built.
  • a framework for a relational column is constructed.
  • a check is made on whether the particular column whose framework was just built draws its values from a domain or another relation, as specified by the administrator or user. Assuming that the column draws it values from a domain, processing proceeds at block 116 during which a reference to the associated domain is included in the framework for the column and processing continues at block 120.
  • processing continues at block 118.
  • a reference to the particular relation is included in the column framework to indicate which relation the column draws its values from. The processing continues at block 120.
  • the RPU determines whether there are any other columns which need the framework to be built for the current relation. Assuming that there are still more columns which need a framework to be constructed, processing continues at blocks 108, 110, 116, 118 and 120 until all columns of the relation have frameworks constructed. Assuming that all of the column frameworks for a particular relation are built, processing continues at block 124. During block 124, the RPU determines whether there are other relations which require frameworks to be constructed. Assuming that there are other relations which require frameworks to be constructed, processing continues at blocks 106, 108, 110, 116, 118, 120 and 124, until all of the relations have the required frameworks constructed.
  • processing continues at block 127.
  • processing returns to the calling program.
  • This procedure of creating the framework for a database is the same as described in the above- identified application entitled "A RELATIONAL DATABASE REPRESENTATION WITH RELATIONAL DATABASE OPERATION CAPABILITY.”
  • FIG. 10 a flow diagram for the INSERT ROW routine is shown.
  • the purpose of the INSERT ROW routine is for converting each row of a load file associated with a relational database into one or more assigned identifiers depending on whether the row is "atomic" or "non-atomic.” If the row is "atomic,” it is treated as an entity having an associated identifier assigned. If the row is "non-atomic,” it must be treated as two or more entities each having an associated identifier assigned.
  • a "framework" for the relational database has been constructed by the CREATE DATABASE routine (FIG. 9) and that the identifiers will have a place to be stored within the framework.
  • the first of the rows from the load file is examined to determine if it is "atomic," i.e., whether more than one identifier is required to represent the row
  • the RPU parses the row into one or more smaller entities. Each smaller entity corresponds to a column of the relation. Processing continues at block 264
  • the RPU determines if the entity is referenced in a domain or a relation. Assuming that the entity is a domain reference, processing continues at block 256. During block 256, the RPU determines if the entity is already in a domain. Assuming that the entity is currently in a domain, processing continues at block 267. However, if the entity is not in a domain, then processing continues at block 258. During block 258, the RPU first inserts the selected entity ' (value) into the domain, and then processing continues at block 267.
  • a number for the entity is returned as an identifier. All previous values stored in the domain have identifiers associated with them. When a new entity is added into the domain, the next available identifier is provided. Typically, the entities are added to the domain in an order by entry method and the identifiers are determined by the order of entry. In this way, the returned identifiers will not change previous identifiers associated with values in the domain. Processing continues at block 262 during which the returned identifier is placed in the appropriate row of the relation.
  • processing goes directly to block 270.
  • the RPU selects an identifier or identifiers by finding a matching entity or entities from the column's domain.
  • the RPU determines whether more than one identifier was returned. If more than one identifier is returned, the processing continues at block 274, during which a null value is inserted in the relational row. The null value is a flag to the user that there is not exactly one referenceable value and thus, a connection cannot be established for that identifier. Processing continues at block 280.
  • processing continues at block 276.
  • the RPU places the returned identifier in the appropriate relational row of the relation.
  • processing continues at block 280.
  • the RPU determines whether there are any more entities left for processing. Assuming that there are still more entities left for processing, then processing continues at block 278 to obtain the next entity from the load file. Processing continues at blocks 268, 256, 258, 267, 262, 270, 272, 274, 276, 280 and 278 until all of the entities for the particular relation are processed. If there are no more entities for processing, then processing continues at block 284.
  • the RPU returns an identifier for the row in which an identifier was placed during blocks 262 and/or 276. During block 285, this row number identifier is also placed in the relational row, completing the INSERT ROW routine. Processing continues at block 286, during which the RPU returns the processing to the BUILD DATABASE routine, block 161 (FIG. 8) .
  • the purpose of this example is to take the prior art relational database as depicted in FIGS. 1(a), 1(b) and 1(c) and convert it into the relational database represented by identifiers as shown in FIGS. 6(a), 6(b) and 6(c).
  • the relational database as shown in FIGS. 1(a), 1(b) and 1(c) is stored in the external device 109 (FIG. 7) .
  • the relational database data is stored in a load file in the order as shown in FIG. 1.
  • the order for loading the relational database into the RDMS 105 (FIG. 7) is determined by the relational database administrator.
  • FIG. 8 is- the overall block diagram which represents control over building the relational " database with identifiers.
  • the CREATE DATABASE routine (FIG. 9) is called.
  • the CREATE DATABASE routine builds the framework for the domains, relations and columns of the relational database shown in FIG. 6.
  • the first step of the CREATE DATABASE routine occurs during block 102, during which the framework for the first domain of the database is created. Following the example, the framework for the Last Name domain is generated first. Also established is the characteristic of the domain including its name, data type, value size, etc.
  • RPU determines if there are other domains which need a framework created. Processing continues at block 102 and the First Name domain framework is generated. Blocks 102 and 104 are repeatedly processed until all of the frameworks for the various domains of the relational database are generated. More particularly, the Salary domain, the Department domain, and the Title domain, have frameworks built.
  • the framework for the first relation of the relational database is constructed.
  • an empty framework for the PERSON table is constructed.
  • the framework for the first relational column of the PERSON table is built, more particularly, the Last Name column has its framework constructed.
  • the RPU includes a reference to the LAST NAME domain in the LAST name framework.
  • the framework for the FIRST name column of the Persons relation is constructed. Processing continues at block 111, during which the RPU determines if the First name column has values which are directly drawn from a domain. The First name column of the PERSON table draws its values directly from the FIRST NAME domain, and thus processing continues at block 116. During block 116, the RPU places a reference in the First name column to the domain for first names. Processing continues at block 120 during which the RPU determines whether there are other columns which need frameworks constructed for the PERSON table. There are still columns which need construction, thus processing continues at block 108.
  • the framework for the Birth date column of the PERSON table is constructed. Processing continues at block 110, during which the RPU determines if ' the Birth column draws its values from a domain. birth column draws its values from the birth domain and thus, processing continues at block 116. During block 116, the RPU places a reference to the Birth domain in the Birth column. Processing continues at block 120, during which the RPU determines that there is still another column left which needs to have its framework constructed. Processing continues at block 108.
  • the framework for the Employee relation is constructed.
  • Processing continues at block 108, during which the framework for the Employee column is built.
  • the Employee column refers back to the PERSON table for the employee's First Name, Last Name and birth Date.
  • the RPU determines that the Employee column draws its values from a relation.
  • processing continues at block 118.
  • RPU includes a reference in the Employee column to the PERSON table from which the column draws its values.
  • processing continues at block 120, during which the RPU determines that there are still more columns in 1 the relation which require construction. Processing continues at block 108.
  • the framework for the Salary column is built. Note that the PERSON table does not have a 5 Salary column.
  • the RPU determines that the Salary column draws its values directly from the Salary domain. Thus, processing continues at block 116. During block 116, the RPU places a reference to the Salary domain in the Salary column. Processing continues at
  • the framework for the Department column is constructed.
  • the RPU 15 determines that the Department column draws its values from the Department domain.
  • the RPU includes a reference to the Department domain .in the Department column. Processing continues at block 120, during which the RPU determines that there are no more
  • Block 124 is entered and the RPU determines that there are still other relations to be built for the relational database.
  • the MANAGER table relation is constructed. Processing continues at blocks 106, 108,
  • the RPU at block 124 determines that there are no more relations to be constructed for this relational database.
  • processing returns to the calling program or the BUILD DATABASE routine (FIG. 8) at block 154.
  • the RPU selects the Persons relation to be loaded into the RDMS 105. Recall that the Persons relation was strategically positioned to be input into the RDMS first. In this manner, identifiers can later be referenced. Processing continues at block 155 during which the RPU obtains the first row from the load file containing the Persons relation. Once again, the Persons relation depicted in FIG. 1(a) represents the load file and the order in which the rows are to be loaded into the RDMS. Then, during block 152, the INSERT ROW routine (FIG. 10) is called to convert the row from the load file into the identifier-represented row, as depicted at 18(a) " of FIG. 6(a) . Referring to FIG.
  • the RPU determines if the row or entity currently loaded is atomic, i.e., can be identified as a single entity.
  • the row loaded is row 18(a) of FIG. 1(a); Jones, Sam, 9/9/30. This row is non- atomic and thus, during block 264, it is parsed into three smaller entities which are atomic.
  • the first atomic entity is for Jones, and during block 266 it is obtained.
  • the RPU determines if Jones is referenced by any domain. Currently, all of the domains are empty. However, the CREATE DATABASE routine (FIG. 9) has constructed their frameworks.
  • the RPU determines that the value for Jones is not located in the Last Name domain. Thus, processing continues at block 258, during which the value for Jones is inserted into the Last Name domain as shown in the RESULTS table (FIG. 11(b)) at line 412.
  • the RESULTS table depicted in FIG. 11 is divided into two portions.
  • the left hand side of the table depicting the construction of the Persons relation represented in the form of identifiers is broken up into six columns: Identifier (ID#) , Person's Last Name, Person's First Name, birth Date, Father and Mother.
  • Identifier (ID#) the domains related to the PERSON table are shown, in four columns: Identifier (ID#) , Last Name, First Name, and birth Date.
  • the Identifier "01" is shown associated with Jones at 412 in FIG. 11(a). This Identifier is returned to the Persons relation and at block 262 it is placed in the first row of the Last Name column of the Persons relation (352, FIG. 11(a)).
  • the RPU determines whether Sam references a domain or a relation.
  • the value Sam is in a domain and thus, during block 256 the RPU determines whether the value Sam is already contained in the First
  • the RPU determines that the date, 9/9/30, references a domain, namely, the birth domain. Processing continues at block 256, during which it is determined that the date 9/9/30 is not contained in the birth domain. During block 258, the date 9/9/30 is inserted into the Birth domain, as shown at 420 of FIG. 11(a) . This Birth date is entered in the first row of the Birth domain and thus, is associated with the identifier "01". During block 267, the identifier is returned, and during block 262, the identifier "01" is placed in the first row of the Birth column of the PERSON table (360, FIG. 11(a)).
  • the RPU determines that there are no more entities left for processing. Processing continues at blocks 284 and 285 during which the identifier associated with the inserted row, for "Jones, Sam, 9/9/30" is returned and placed in the Persons relation table (362, FIG. 11(a)). The newly inserted row for "Jones, Sam, 9/9/30” is the first inserted row of the Persons relation and, thus, the identifier for the row is "01" (362, FIG. 11(a)). Processing continues at block 286, in which processing returns to the BUILD DATABASE routine (FIG. 8) at block 161. During block 161 (FIG.
  • the RPU determines whether there are any more rows in the Persons relation left for processing. There are still currently more rows in the Persons relation as shown in FIG. 1 and, thus, processing continues at block 155.
  • the next row from the load file (FIG. 1(a)) is obtained and then during block 152 the INSERT ROW routine (FIG. 10) is called again. The above process is repeated for Mary Smith, and
  • the RPU determines if there are any more rows in the Persons relation which need to be processed. There are many more rows remaining in the load file for the PERSON table and, thus, processing continues at blocks 155 and 152.
  • the next five input rows of the Persons relation (18(d), 18(e), 18(f), 18(g), 18(h) of FIG. 1(a)) follow the strategy as discussed earlier for the first three input rows (18(a), 18(b), 18(c) of FIG. 1(a)) which was discussed above. Referring to entries 392 and 454 of FIG. 11(b), the. Persons relation having its identifiers in the various domains are shown after rows 18(a) through 18(h) are loaded into the RDMS.
  • rows 18(e) , 18(f), 18(g), 18(h) include at least one entity that is already in a domain. Accordingly, the identifiers in the corresponding rows in the PERSON table, as indicated at 292, are not all the same value.
  • row 18(i) of FIG. 1(a) The next row of the PERSON table to be loaded is row 18(i) of FIG. 1(a).
  • row 18(i) "Jones, John, 3/5/62; Jones, Sam, 9/9/30; Smith, Mary, 9/14/39" is obtained from the load file.
  • Processing continues at block 152, during which the INSERT ROW routine (FIG. 10) is again called.
  • the RPU determines if the newly obtained row is atomic.
  • the row is not atomic and, thus, during block 264, the row is parsed into five atomic entities, mainly, Jones, John, 3/5/62; Jones, Sam, 9/9/30; and Smith, Mary, 3/14/39. Jones, Sam, 9/9/30 and Smith, Mary, 3/14/39 are parsed as two atomic entities because only two columns are specified namely, Father and Mother of the PERSONS table, and, by definition are entities, already occurring in prior rows of the table.
  • the first entity, "Jones” is obtained during block 266. Then, during block 268, the RPU determines that "Jones" references a domain.
  • processing continues at block 267, during which the identifier "01" associated with Jones is returned and during block 262 the identifier is placed into the ninth row of the Last Name column of the PERSON table (394, FIG. 11(b)).
  • processing continues at block 280, during which the RPU determines whether there are any more entities left for processing. The next entity, "John” is obtained during block 278 and processing continues at block 268.
  • RPU determines that "John” references the First Name domain.
  • 256 it is determined that "John” is not in the First Name domain.
  • "John” is inserted into the next row of the First Name domain which is row 8, FIG. 11(c).
  • the identifier "08" is returned and during block 262, the identifier is placed into the ninth row of the First Name column of the PERSON table (398, FIG. 11(c)).
  • Processing continues at block 28.0, during which the RPU determines if there are any more entities left for processing.
  • date "3/5/62" is obtained and processing continues at block 268.
  • processing continues at block 268. 5 During block 268, the RPU determines whether the entity, "Jones, Sam, 9/9/30" references a domain or a relation. This entity references an entry in a prior row of the PERSON table, namely, Sam Jones at row 18(a) of FIG. 1(a). Thus, processing continues at block 270, 0 during which the identifier associated with the entity "Jones, Sam, 9/9/30" is obtained. The identifier associated with that entity is "01", as indicated in the ID# column. See 362 of FIG. 11(a).
  • the RPU determines that there is exactly one identifier 5 returned and, thus, processing continues at block 276 which places the identifier into the Father column of the PERSON table (404, FIG. 11(d)). Processing continues at block 280, during which the RPU determines that there is still one more entit " left in the input row, namely, 0 "Smith, Mary, 3/14/39.” This entity is obtained at block 278. Processing continues at block 268.
  • Processing continues at block 280 during which the RPU determines that there are no more entities for processing 5 and thus processing continues at blocks 284 and 285.
  • the row number associated with the newly inserted row is placed into the Identifier column of the Persons relation.
  • the newly inserted row is the ninth row of the PERSON table and, thus, the identifier "09" is added to the Persons relation (408, FIG. 11(d)).
  • Processing returns to the calling program, BUILD DATABASE routine (FIG. 8) , at block 161.
  • the RPU determines whether there are any more relations to load. There are still two more relations left for loading and they are the Employee relation and the Manager relation. Processing continues at block 154, during which the Employee relation is selected as the next relation for loading. Then, in block 155, the first row of the EMPLOYEE table as shown in FIG. 1(b) is obtained. The first row of the EMPLOYEE table (FIG. 1(b)) is "Jones, John, 3/5/62, $31,000.00, Marketing.” Then, in block 152, the INSERT ROW routine (FIG. 10) is called to process this row of the load file, as indicated at 474-484 and 514-522 of FIG. 12.
  • the INSERT ROW routine is repeated for the three additional employees of the Employee Table, as indicated at 486-512 and 524-550 of FIG. 12.
  • processing continues at block 280, during which RPU determines that there are no more entities left for processing in the last inserted row.
  • the row number identified with the newly inserted row is determined to be "04" and it is placed into the fourth row of the Identifier column of the Employee relation (512, FIG. 12) .
  • processing returns to the BUILD DATABASE routine (FIG. 8) at block 161.
  • the RPU determines that there are no more rows in the Employee relation to be inserted and, thus, processing continues at block 165.
  • the RPU determines that there is still one more relation, namely, the Manager relation, which needs to be loaded.
  • Processing continues at block 154 during which the Manager relation is obtained.
  • the first row of the Manager relation "Smith, Mary, VP, Security” is obtained from the load file.
  • the INSERT ROW routine (FIG. 10) is called.
  • the input row is examined and it is determined that it is not atomic.
  • the RPU determines that there are other entities which need to be processed in the input row and processing continues at block 278, during which the entity "VP" is obtained. Processing returns to block 268 during which it is determined that the entity references a domain. At block 256, it is determined that the entity is not located in the Title domain and, thus, during block 258 it is inserted into the first row of the Title domain (572, FIG. 13) . During block 267, the identifier associated with "VP" is determined to be "01" and during block 262 it is placed into the first row of the Title column in the Manager relation (556, FIG. 13) . Processing continues at block 280, during which the RPU determines that there is one more entity left for processing.
  • the entity "Security” is obtained and processing continues at block 268.
  • block 168 it is determined that the entity references a domain and in block 256 it is determined to be in the second row of the Department domain (574, FIG. 13) .
  • the identifier associated with "Security” is determined to be "02" and during block 262 the identifier is placed in the first row of the Department column of the Manager relation (555, FIG. 13) .
  • Processing continues at block 280 during which the RPU determines that there are no more entities left to process for this particular inserted row.
  • the row number "01" of the newly inserted row is placed in the Identifier (ID#) column of the Manager relation (560, FIG. 13) .
  • processing returns to the BUILD DATABASE routine (FIG. 8) at block 161.
  • the RPU determines that there is still one more row in the Manager relation to be added and, thus, processing continues at block 155 and 152 to insert the row "Doe, John, VP, Marketing" into . its identifier form in the Manager relation.
  • the row is obtained from the load file and in block 152 the INSERT ROW routine (FIG. 10) is called.
  • the row is not atomic.
  • the row is parsed up into the following entities, "Doe, John", “VP”, "Marketing”.
  • the entity "Doe, John” is obtained and during 268 it is determined that it references a relation.
  • the identifier associated with the entity is determined to be "02" and at block 272, it is confirmed that only one identifier is returned.
  • the identifier "02" is placed into the second row of the Manager column of the Manager relation (562, FIG. 13).
  • the RPU determines that there are still entities left for processing and, thus, in block 278, the entity "VP" is obtained.
  • Processing continues at block 268, during which the entity is determined to reference a domain.
  • the entity is determined to be in the first row of the Title domain (580, FIG. 13).
  • the entity "Marketing" at block 278 is obtained and during block 268 it is determined to reference a domain.
  • block 256 it is determined that the entity is in the first row of the Department domain and has an identifier of "01."
  • the identifier is returned. In block 262, it is placed into the second row of the Department column of the Manager relation (566, FIG. 13) .
  • the row number associated with the newly inserted row is determined to be "02." This number is placed into the second row of the Identifier column of the Manager relation (568, FIG. 13).
  • the entire relations shown in FIGS. 1(a), 1(b) and 1(c) have been loaded into the RDMS (105, FIG. 7) and has been converted into a relational database represented by identifiers as shown in FIG. 6 and at 410 of FIG. 11 (PERSON table) , 512 of FIG. 12 (EMPLOYEE table) and 568 of FIG. 13 (MANAGER table) .

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Système de gestion de bases de données relationnelles dans lequel des entités stockées dans la base de données sont remplacées dans les relations avec les identificateurs codés (270, 285). On se trouve confronté à des problèmes dans la gestion de bases de données relationnelles lors du changement de données correctes pour certaines entités mais incorrectes pour d'autres. La présente invention résout le problème d'intégrité référentielle en étendant l'utilisation d'identificateurs de sorte que chaque entité relationnelle d'une relation peut faire référence à une autre entité relationnelle (ou ligne) de la même relation ou d'une relation différente. On peut remplacer un groupe d'entités connexes dans la file d'une relation, par un identificateur dans toutes les autres relations dans lesquelles les mêmes entités sont groupées dans une file. Les identificateurs remplacent les entités dans toutes les opérations de bases de données relationnelles.
PCT/US1988/003477 1987-10-09 1988-10-06 Base de donnees relationnelles utilisant des identificateurs Ceased WO1989003567A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US10743587A 1987-10-09 1987-10-09
US107,435 1987-10-09
US23770288A 1988-08-26 1988-08-26
US237,702 1988-08-26

Publications (1)

Publication Number Publication Date
WO1989003567A1 true WO1989003567A1 (fr) 1989-04-20

Family

ID=26804779

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1988/003477 Ceased WO1989003567A1 (fr) 1987-10-09 1988-10-06 Base de donnees relationnelles utilisant des identificateurs

Country Status (2)

Country Link
AU (1) AU2722388A (fr)
WO (1) WO1989003567A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997015015A3 (fr) * 1995-10-13 1997-05-15 Annette Brueckner Systeme d'informations et procede de memorisation de donnees dans un systeme d'informations

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE68926422T2 (de) * 1988-09-23 1996-11-07 Ibm Datenbankverwaltungssystem

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4631673A (en) * 1985-01-22 1986-12-23 International Business Machines Corporation Method for refreshing multicolumn tables in a relational data base using minimal information

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4631673A (en) * 1985-01-22 1986-12-23 International Business Machines Corporation Method for refreshing multicolumn tables in a relational data base using minimal information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
C.J. DATE, "An Introduction to Database Systems", published 1981 by Addison-Wesley Publishing Co., see pages 3-143, 159-169, and 498-503, especially pages 114, 115, 140 and 141. *
G. EVEREST, "Database Management: Objectives, System Functions, and Administration", published 1986 by McGraw-Hill Book Co., (New York), see pages 120-149, especially pages 124, 128, 129, 131, 136 and 141. *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997015015A3 (fr) * 1995-10-13 1997-05-15 Annette Brueckner Systeme d'informations et procede de memorisation de donnees dans un systeme d'informations

Also Published As

Publication number Publication date
AU2722388A (en) 1989-05-02

Similar Documents

Publication Publication Date Title
US4933848A (en) Method for enforcing referential constraints in a database management system
US6374252B1 (en) Modeling of object-oriented database structures, translation to relational database structures, and dynamic searches thereon
EP0722591B1 (fr) Procede et appareil de stockage et d'extraction de donnees
US6564212B2 (en) Method of processing queries in a database system, and database system and software product for implementing such method
US6212524B1 (en) Method and apparatus for creating and populating a datamart
US6711563B1 (en) Methods of organizing data and processing queries in a database system, and database system and software product for implementing such methods
US6014670A (en) Apparatus and method for performing data transformations in data warehousing
EP0633538A2 (fr) Système de gestion de base de données relationnel
US7739224B1 (en) Method and system for creating a well-formed database using semantic definitions
US20020095421A1 (en) Methods of organizing data and processing queries in a database system, and database system and software product for implementing such methods
US20020194196A1 (en) Method and apparatus for transforming data
US20120158793A1 (en) Automatic Elimination Of Functional Dependencies Between Columns
US20020093522A1 (en) Methods of encoding and combining integer lists in a computer system, and computer software product for implementing such methods
US20050216497A1 (en) Uniform financial reporting system interface utilizing staging tables having a standardized structure
EP0857334A1 (fr) Systeme de diffusion et d'archivage d'informations sur l'entreprise
WO2002047463A2 (fr) Procede et appareil de transformation de donnees
EP1672540A1 (fr) Rayonnage muni de casiers de picking destiné à la préparation automatisée des commandes
US7617206B1 (en) Method for analyzing status of specialized tank files which store and handle large objects
Plew et al. Sams teach yourself SQL in 24 hours
WO1989003567A1 (fr) Base de donnees relationnelles utilisant des identificateurs
US20030055838A1 (en) Data storing method and data storing structure
US20060074982A1 (en) Method for comparing tabular data
Maheshwari et al. Introduction to SQL and PL/SQL
Smith A Survey of Relational Database Management Systems of Microcomputers
Stephens et al. MySQL Column and Table Types

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE FR GB IT LU NL SE