GB2418748A - Directory structures for composite data files - Google Patents
- Publication number
- GB2418748A GB0421636A
- Authority
- GB
- United Kingdom
- Prior art keywords
- file
- sub
- predetermined
- substructure
- function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/12—Protecting executable software
- G06F21/14—Protecting executable software against software analysis or reverse engineering, e.g. by obfuscation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Technology Law (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Storage Device Security (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Hindering piracy, reverse-engineering and ripping of software by creating a directory structure (80) comprising a plurality of substructures (100), for use in accessing selected sub files (30) from within a composite data file (70), each sub file (30) having an associated filename. The invention comprises associating an unique substructure (100) with each sub file (30) by applying a first predetermined function to the associated filename and using the result as an index into the plurality of substructures (100), and applying a second predetermined function to the associated filename of each sub file (30) and storing the result in a first portion of the substructure (100) associated with the sub file (30). Checksum and hashing techniques may be used for verification and encryption, allowing only legitimate access to sub files (30).
Description
Directory structures for composite data files
The present invention relates to directory structures and associated composite data files, and in particular, but not exclusively, to the use of such directory structures and associated composite data files to enhance copy and rip protection of commercial software packages, especially games software and the like.
Background of the invention
Modern software applications, designed for execution on computers situated, for example, at home or in an office, usually comprise a large number of different files. Such software applications can be for business use, such as spreadsheet software, word processing software or dictation software, or they can be for other purposes, such as entertainment, for example computer video games.
Whatever the application, a common way to distribute such software is in the form of one or more computer readable media containing the various files required by the application. Typically, the media will be some form of optical storage medium, such as a CD-ROM, CD-R, DVD-ROM, DVD-R and the like. However it may equally be some other form of storage such as magnetic disks or static memory, or the files may even be sent over a network connection.
The choice of storage medium typically depends on the combined size of the application files. The trend nowadays is towards distribution on DVD based media, for the main reason that such media are relatively cheap, yet can contain larger quantities of data than CD-ROMs, as used previously.
The application files can either be loaded directly from the computer readable media used to distribute the software to end users, or they can be installed on to a computer prior to use.
When installed prior to use, the application files are typically stored on to magnetic disks, or hard drives contained within the computer on which the application will run. These hard disks typically have lower access times and faster transfer rates than the optical media used to distribute the application, and therefore application execution times are improved by carrying out the installation prior to use. However, in some computer systems, such as home consoles, there is no installation procedure, and the application often runs directly from the distribution media.
Typically, a software application may contain a primary application file which is the application executable, or main application programme. This is usually the first file to be loaded into the target computer's random access memory by the loading procedure, although it could also be loaded by a preliminary or booting application. The primary application executable usually controls the subsequent loading of all the ancillary files necessary for the correct operation of the programme as a whole.
The ancillary files, or data sub files, loaded by the main executable contain further data required by the main executable. These sub files may contain, for example, additional executable code, map data, computer model data, audio samples, music, textures or configuration information required to configure hardware or software components, ready for use by the main application executable.
Finally, there are often a number of supplementary files included that are not strictly necessary to the running of the main application executable. These either provide useful or interesting information to the user, for example, help in how to use the main executable, or serve some other useful purpose, such as providing access to third party software tools necessary to use these supplementary files, or may be required for legal reasons, e.g. licence files or disclaimers.
In the case of many software applications, but most particularly in the case of computer games, the data required by the game when running is often compacted into a single or smaller number of composite data files. Examples of such systems include WAD and PAK type files. Numerous patents have been published on the construction of such file systems, as well as efficient database access methods, for example see US Patents 5983239, 5218696 and 4611272.
One problem faced by the software industry is that of software piracy. As a result of piracy, it has become necessary to find ways of protecting software applications, including games, from the attentions of those who seek to profit by illegally reproducing and selling copies of these products.
There are a number of ways used to illegally copy, or pirate, software, and we will outline three in particular.
The first is one-to-one copying. This simply involves taking an exact digital copy of a software program and burning it on to a recordable media, for example, a DVD-R optical disc.
The second exploit is reverse engineering. This usually involves decompiling the program code to enable the removal of software copy protection measures, such as trigger functions which would otherwise fire if a copied disc were to be detected, or integrity checks such as checksums on data files loaded by the main executable file. Another area targeted is that of bonus codes, which are purchased separately by end users and entered to unlock bonus features of the game. These are often removed from the original main executable code and released as a standalone bonus code generator.
The third exploit is 'ripping'. In this exploit, data deemed to be unessential to playing the game, such as full motion video cut scenes, are removed from the disc, which is then reburnt (sometimes to a CD-R, if enough material can be removed) or placed on an internet website, frequently referred to as a warez site, for download.
Over the years, there have been numerous systems developed that try to tackle the problem of software piracy.
Most systems developed so far have been directed towards the prevention of the first of the above exploits from working, i.e. one-to-one copying. These previously known methods to prevent one-to-one copying usually focus on tying the application software to a physical property of a legitimate disc. This physical property could be, for example, information on a secret sub channel located on the disc, variations in the structure of the glass master used to produce a disc, or a special pattern of unreadable sectors on a disc. However, these systems can be overcome either by advanced burning software capable of, for example, reading and writing to secret sub channels, or by utilising the reverse engineering exploit mentioned above to remove these protection systems, or at least circumvent them.
Summary of the Invention
The invention addresses these and other problems and limitations of the related prior art.
Generally, the invention provides a method of creating a data structure comprising a plurality of substructures, each substructure having an associated reference or label, the method comprising associating an unique substructure with each reference by applying a first predetermined function to the associated reference and using the result as an index into the plurality of substructures; and applying a second predetermined function to each reference and storing the result in the substructure associated with the reference.
The data structure could be or form part of a database, with further data to be associated with each reference being stored in the associated substructures. Alternatively, the data structure could be a directory to facilitate access to other data structures or files which are known by means of the references or labels.
More particularly, the invention provides a method of creating a directory structure comprising a plurality of substructures, for use in accessing selected sub files from within a composite data file, each sub file having an unique associated filename (which includes the fully qualified subdirectory path), the method comprising associating an unique substructure with each sub file by applying a first predetermined function to the associated filename and using the result as an index into the plurality of substructures and applying a second predetermined function to the filename of each sub file and storing the result in a first portion of the substructure associated with the sub file.
Preferably, the directory structure is used by a predetermined executable to access selected sub files from within a composite data file, and the method further comprises the step of encrypting a second portion of each substructure using an encryption key derived from the predetermined executable using a third predetermined function.
In essence the invention provides a method of packing a large number of the data sub files required by a main executable into one or more composite data files, which are then accessed via an encrypted directory file by the main executable programme. The encryption of the directory file is dependent upon a checksum of the executable which seeks access to the data sub files, and, therefore, if this executable is tampered with in any way, this will make the data sub files inaccessible.
Furthermore, additional checks may be included in the executable which test the integrity of the data sub files as they are accessed, so that embodiments can incorporate a degree of cyclical integrity. These checks could comprise a fourth predetermined function being applied to a sub file in order to generate a checksum of the file contents.
Of course, the filename associated with each data sub file may be a string of characters, a number, or any other label that can uniquely identify an individual data sub file. The filename may include a three character extension which may also be used by the predetermined functions, but equally the extension might not be present, or if it is, ignored when applying the predetermined functions to the filename.
The method is adapted to improve the resiliency of software against copying and dissemination to users who do not have the appropriate permissions or licenses to operate the software, and in particular is adapted to prevent copying of computer game software.
Preferably, the first, second, third and fourth predetermined functions include hash functions. Preferably, the first predetermined function is a division hash function, the second predetermined function is a Fibonacci hash function and the third and fourth predetermined hash functions are message digest hash functions, for example MD5, SHA or RIPEMD, for providing an unique value derived from an input.
Preferably, the first predetermined function further includes a first double offset function for use when the first hash function does not produce an unique substructure index for the associated sub file.
Preferably, the method further includes the step of concatenating together the sub files into at least one composite data file in accordance with the directory structure. The directory structure created by the method is capable of providing access to sub files located in multiple composite data files.
Preferably, the method further includes the step of encrypting a second portion of each substructure using a block encryption algorithm.
Preferably, each sub file has a file size, and the method further includes the step of determining the size of each sub file and storing the result in the second portion of the substructure associated with the sub file.
Preferably, each sub file has a position offset from the start of the composite data file and the method further comprises the step of storing the position offset of each sub file within a composite data file in the second portion of the substructure associated with the sub file.
Preferably, the sub file may be in any one of a plurality of composite data files, the method further comprising the step of storing, in the first portion of each substructure, the number of the composite data file containing the associated sub file.
Preferably, the sub file may be included in at least one composite data file more than once, and the method further includes the step of storing the instance number of the associated sub file. The inclusion of more than a single instance of a particular data file may be used to provide improved sub file access performance, for example, by supplying access information particular to an instance of the required file nearest the current read location of the equipment executing the application executable.
Preferably, the first portion of each indexed substructure further includes the number of instances of the associated sub file.
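The use of multiple instances to shorten seeks can be illustrated with a minimal sketch. It assumes that each instance is known by an instance number and a position offset, and that all instances lie in the same composite data file; the function name and the tuple layout are hypothetical and are not taken from the specification.

```python
def nearest_instance(instances, current_position):
    """Pick the instance whose offset is closest to the current read position.

    instances: list of (instance_number, position_offset) tuples (assumed layout).
    """
    return min(instances, key=lambda inst: abs(inst[1] - current_position))


# Three copies of the same sub file placed at different offsets.
copies = [(0, 1_000), (1, 500_000), (2, 2_000_000)]
chosen = nearest_instance(copies, current_position=450_000)
```

Here `chosen` is the second copy, since its offset is the closest to the assumed read position.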
Preferably, the method further includes the step of adding padding data to the concatenated together sub files to create composite data files of a certain size.
The invention also provides a directory structure for a computer, for use by a predetermined executable to access a selected sub file from within a composite data file, each sub file having an associated filename, comprising a plurality of substructures, each substructure being associated with a particular sub file by a result of applying a first predetermined function to the filename associated with the sub file, each substructure including a first portion containing the result of applying a second predetermined function to the filename of the associated sub file, and a second portion encrypted using a key derived from applying a third predetermined function to the predetermined executable.
The invention still further provides a computer readable medium, comprising at least one composite data file and a directory structure according to the method provided.
The invention still further provides a method of enabling a predetermined executable to access a sub file located within a composite data file, which includes a plurality of sub files, each sub file having an associated filename, the method comprising using the result of a first predetermined function applied to a filename associated with a sub file to be accessed to identify a substructure indexed by the resultant value, applying a second predetermined function to the filename associated with the sub file being accessed, and comparing the resultant value with a value stored in a first portion of the substructure indexed by the result of the first step, decrypting a second portion of the said substructure using a key derived from applying a third predetermined function to the predetermined executable, when the two values compared in the second step are equal, and loading the sub file using information contained within the decrypted second portion of the substructure associated with the sub file being accessed.
Preferably, the method further comprises the step of applying a predetermined offset function to the filename of the sub file being accessed when the two values compared in the second step are not equal.
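The access steps above can be sketched as a probe loop. This is only an illustration under stated assumptions: the three functions are passed in as parameters, the toy functions used in the demonstration are hypothetical stand-ins rather than the functions of the preferred embodiment, and an entry is modelled as a dictionary with a `tag` field (first portion) and a `locked` field (second portion, which the caller would decrypt before loading).

```python
def find_entry(directory, filename, index_fn, offset_fn, tag_fn):
    """Return the substructure whose stored tag matches the filename, or None."""
    size = len(directory)
    idx = index_fn(filename) % size      # first predetermined function
    step = offset_fn(filename)           # offset applied when the tags differ
    tag = tag_fn(filename)               # second predetermined function
    for _ in range(size):
        entry = directory[idx]
        if entry is None:
            return None                  # unused slot: no such sub file
        if entry["tag"] == tag:
            return entry                 # caller decrypts entry["locked"] next
        idx = (idx + step) % size
    return None


# Toy stand-in functions (hypothetical), chosen so that "ab" and "ba" collide.
toy_index = lambda name: sum(map(ord, name))
toy_offset = lambda name: 1
toy_tag = lambda name: sum((i + 1) * ord(c) for i, c in enumerate(name))

directory = [None] * 7
directory[6] = {"tag": toy_tag("ab"), "locked": b"entry-for-ab"}
directory[0] = {"tag": toy_tag("ba"), "locked": b"entry-for-ba"}  # moved by collision

found = find_entry(directory, "ba", toy_index, toy_offset, toy_tag)
```

Note that the directory holds only tags, never the filenames themselves, so a lookup succeeds only for a caller that already knows the name it wants.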
Preferably, the method further comprises the step of comparing a file size of the sub file stored in the second portion of the associated substructure with a predetermined file size, and checking the integrity of the sub file if the file size of the sub file is found to be below the predetermined file size.
The invention further provides apparatus for building a directory structure comprising a plurality of substructures, for use by a predetermined executable to access selected sub files from within a composite data file, each sub file having an associated filename, the apparatus comprising means for applying a first predetermined function to a filename associated with a sub file and using the result as an index into the directory substructures, means for applying a second predetermined function to a filename associated with a sub file and storing the result in a first portion of each associated substructure, and means for applying a third predetermined function to the predetermined executable and encrypting a second portion of each substructure dependent on the result of applying the third predetermined function to the predetermined executable.
The invention also provides a process to produce a circular closure giving rise to mutual dependency between the executable and the data files required by the executable, by performing data integrity checks and by making the decryption key required to decrypt the directory information used to locate the data sub file depend on the checksum of the executable. In particular, there is provided a method of linking a predetermined executable program to a plurality of sub files to be accessible by the executable during execution by a computer, the method comprising concatenating the plurality of sub files into a composite data file, creating a directory structure for accessing the individual sub files within the composite data file, wherein the directory structure includes a plurality of substructures, each substructure being uniquely associated with a particular sub file, storing information necessary to locate a sub file within the composite data file in the substructure associated with each sub file, deriving an encryption key from the predetermined executable and encrypting at least a portion of the directory substructure containing said information using the encryption key, to thereby restrict access to the sub file.
There is also provided a method and apparatus for referencing data sub files required by an executable, during its execution, by twin hashing the filename and using the hash results to reference the data sub files, to thereby allow access to the data sub files without exposing the filenames of the data sub files.
There is still further provided a method of checking that a file being accessed by an executable has not been tampered with prior to execution by comparing a checksum of the data sub file created at the authoring stage of the software production process with a checksum calculated from the data sub file actually loaded, and setting a flag marker if the checksums are found to be not equal.
Preferably, the executable checks the status of the flag markers set during integrity checks, and provides means to allow the cessation of the execution of the main executable program as a result of the status of the flag markers.
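The checksum comparison and flag marker described above might look like the following minimal sketch. It assumes MD5 as the checksum (one of the message digests named earlier) and uses a hypothetical `IntegrityFlags` holder; how and when the executable reacts to a raised flag is left open, as in the text.

```python
import hashlib


class IntegrityFlags:
    """Hypothetical holder for the flag markers set by integrity checks."""

    def __init__(self):
        self.tampered = False


def authoring_checksum(data):
    # Computed once at the authoring stage and stored with the directory entry.
    return hashlib.md5(data).digest()


def load_checked(data, expected_checksum, flags):
    # Recompute the checksum on the data actually loaded and compare.
    if hashlib.md5(data).digest() != expected_checksum:
        flags.tampered = True    # only set a marker; do not stop immediately
    return data


original = b"texture data"
stored = authoring_checksum(original)

flags = IntegrityFlags()
load_checked(original, stored, flags)
clean_result = flags.tampered
load_checked(b"texture dat4", stored, flags)
modified_result = flags.tampered
```

Setting a marker rather than halting at once lets the executable test the flags later, at a point of its own choosing.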
The present invention aims to make the 'ripping' exploit virtually impossible, and to seriously hinder the reverse-engineering exploit. Due to the fact that the scheme hinges on whether internal consistency is maintained, straightforward one-to-one copying is still possible.
However the present invention can easily be combined with a suitable physical disc property detection method to make a more robust and less assailable system.
Brief Description of Drawings
The present invention may be put into practice in a number of ways. Some embodiments will now be described, by way of example only, and with reference to the accompanying drawings, in which:
Figure 1 shows principal relationships between an application executable program, a composite data file and an associated encrypted directory structure according to the present invention;
Figure 2 is a top level schematic of various processes involved in producing the data structures of figure 1, the inputs needed to form these data structures, and how they relate when in use;
Figure 3 presents a more detailed view of the process of creating the composite data file and encrypted directory file of figure 2;
Figure 4 shows one preferred embodiment of an encrypted directory indexed substructure of the present invention;
Figure 5 shows a typical process used when executing an application executable that makes use of the present invention;
Figure 6 is a detailed flowchart of a file access process making use of the data structures constituted according to figure 3; and
Figure 7 shows an integrity check process.
Detailed Description of the Preferred Embodiment
Figure 1 illustrates data structures and their relationships according to a first embodiment of the present invention. A directory structure 80 comprises a plurality of substructures 100. A composite data file 70 comprises a plurality of sub files 30 concatenated together. A predetermined executable program 50 accesses the data sub files 30 within the composite data file 70 by using the directory structure 80. A build tool 60 is provided to create both the directory structure 80 and the composite data file 70. In the preferred embodiment, a single directory structure 80 may control access to several composite data files 70.
Typically, the directory structure 80, the composite data file 70 and the executable 50 are stored together on a single software distribution medium, such as an optical disk. The build tool 60 is used by the software provider to construct the various files before writing on to such optical disks.
The directory structure 80 is created by the build tool by associating an unique substructure 100 with each sub file. This is achieved by applying a first predetermined function to a filename of each sub file 30 in order to produce an index into the directory structure 80. If necessary, where the first predetermined function does not produce an unique index from the filename in the first iteration, then a second iteration, using an amended first predetermined function is invoked. This is usually in the form of a double offset function, as is known in the art.
Once an unique substructure 100 has been allocated to each sub file within the composite data file 70, the build tool then applies a second predetermined function to the file name of each sub file and stores the result in a first portion of the substructure 100 associated with the sub file 30.
A third predetermined function is applied to the predetermined executable 50 in order to create a cryptographic key. This key is then used to encrypt the second portion of each substructure 100.
In other preferred embodiments, information relating to the size of a sub file 30 and its position offset from the start of the composite data file 70 is written to a second portion of each associated substructure. This information is then used by the predetermined executable to access, or load, a sub file 30 from within the composite data file 70.
Figure 2 is a top level schematic of a more detailed second embodiment of the present invention, aspects of which may be combined as appropriate with the more general description of the first embodiment set out above. The overall scheme is split into three stages. The first stage includes the conventional compilation 20 of the files necessary to execute the intended software application or game. These files might typically include an application executable 50, a layout file 40 and the numerous data sub files 30 required by the application executable 50, but of course there can be variations on this, for example, multiple executables 50. Compilation 20 of the application executable 50 and its associated data sub files 30, and the layout of those data sub files 40 across the computer readable medium 90 is carried out as is known in the art.
The resultant application executable 50, data sub files 30 and layout file 40 then serve as the primary input files to the build tool 60, which creates the composite data files and encrypted directory file.
The application executable may be written to the storage medium 90 in a native or unprotected form, but is also required by the subsequent build process of the second stage.
The second stage of the process includes the building 200, for example using the build tool 60 of figure 1 or 2, of the composite data file 70 and directory structures 80, ready for storage on the computer readable storage medium 90, for delivery of the application to a user computer 10.
This build process 200 requires at least selected information about the application executable 50, in order to encrypt the directory structure 80 using the selected information, thereby locking access to the composite data file to that application executable 50 only.
In the third stage of the process, the software medium is loaded into the user computer 10, and the application executable 50 is executed 300. During execution, access to the sub files 30, contained within the composite data file 70, is obtained through the encrypted directory structure 80 by decrypting the directory structure 80 using selected information derived from the application executable 50 attempting to access the sub file 30. If the application executable 50 is the same as the original application executable 50 used to encrypt the directory structure 80, i.e. no "hacking" has been carried out on this application executable 50, then the information required to locate the sub file within the composite data file 70 can be decrypted correctly, and used to access the desired sub file 30.
However, if the application executable 50 has been altered, the directory structure 80 cannot be decrypted correctly, and therefore the application executable 50 fails to obtain the information required to access the desired sub file 30, and therefore the sub file fails to load.
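The dependence of the directory on the executable can be illustrated with a minimal sketch. The following are assumptions made for illustration only: the key is the MD5 digest of the executable image (MD5 being one of the digests named earlier), the block cipher is XTEA (a stand-in, since no particular block encryption algorithm is named), and the protected data is a single 8-byte block holding a hypothetical size and position.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9
ROUNDS = 32


def key_from_executable(exe_bytes):
    # 128-bit digest of the executable, split into four 32-bit key words.
    return struct.unpack(">4I", hashlib.md5(exe_bytes).digest())


def _mix(v, s, k):
    return ((((v << 4) ^ (v >> 5)) + v) ^ (s + k)) & MASK


def xtea_encrypt(block, key):
    v0, v1 = struct.unpack(">2I", block)
    s = 0
    for _ in range(ROUNDS):
        v0 = (v0 + _mix(v1, s, key[s & 3])) & MASK
        s = (s + DELTA) & MASK
        v1 = (v1 + _mix(v0, s, key[(s >> 11) & 3])) & MASK
    return struct.pack(">2I", v0, v1)


def xtea_decrypt(block, key):
    v0, v1 = struct.unpack(">2I", block)
    s = (DELTA * ROUNDS) & MASK
    for _ in range(ROUNDS):
        v1 = (v1 - _mix(v0, s, key[(s >> 11) & 3])) & MASK
        s = (s - DELTA) & MASK
        v0 = (v0 - _mix(v1, s, key[s & 3])) & MASK
    return struct.pack(">2I", v0, v1)


# Build time: lock (size, position) to the original executable.
exe = b"original executable image"
locked = xtea_encrypt(struct.pack(">2I", 123456, 65536), key_from_executable(exe))

# Run time: the untouched executable recovers the values, a patched one does not.
good = struct.unpack(">2I", xtea_decrypt(locked, key_from_executable(exe)))
bad = struct.unpack(">2I", xtea_decrypt(locked, key_from_executable(exe + b"!")))
```

With the wrong key the decrypted size and position are effectively random, so the load fails without any explicit "tamper detected" branch for an attacker to patch out.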
Figure 3 shows, at a high level, the build process 200 used to create the composite data files 70 and encrypted directory file 80.
In the preferred embodiment, the numerous data sub files 30 required by the application executable 50 are inputted to the build tool 60, and compacted into one or more much larger composite data files 70 which are then accessed by the application executable 50 by reference to an encrypted directory file 80.
Figure 4 shows a typical encrypted directory file entry 100. In the preferred embodiment, the substructure of each encrypted directory entry 100 is 16 bytes long.
The encrypted directory file 80 typically comprises a number of entries 100, with there being one entry per original data sub file 30, plus some spare, unused entries. Each encrypted directory entry 100 typically contains a number of fields of information. These fields contain information including the size 160 and position 170 of the data sub file within the composite data file 70, a checksum 150 of the data sub file, a composite data file number 120, which uniquely identifies a composite data file, and a data sub file instance 130 and number of instances 140.
The encrypted directory entries 100 are positioned according to a hashing procedure. Due to the nature of hashing algorithms, the encrypted directory file 80 will also contain some unused entries.
In the preferred embodiment, provision is made for the existence of up to 256 composite data files 70 (using 8 bits) with up to 16 instances of each file (using 4 bits), although this can easily be extended at the expense of increasing the size of the encrypted directory structure 80.
Information regarding the size of the data sub file 30, its offset position inside the composite data file 70, and its checksum are all encrypted in order to make it difficult to modify the encrypted directory file 80.
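One possible packing of a 16-byte entry is sketched below. Only the overall length (16 bytes), the 8-bit composite data file number and the two 4-bit instance fields come from the text; the 32-bit filename hash, 16-bit checksum, 32-bit size and 32-bit position, the field order, and the choice to store the number of instances minus one are all assumptions made for illustration. Encryption of the size, position and checksum fields is omitted here.

```python
import struct

# Assumed layout: hash (4) | file number (1) | instance nibbles (1) |
#                 checksum (2) | size (4) | position (4)  = 16 bytes
ENTRY = struct.Struct(">IBBHII")


def pack_entry(tag, file_no, instance, n_instances, checksum, size, position):
    if not (0 <= file_no < 256 and 0 <= instance < 16 and 1 <= n_instances <= 16):
        raise ValueError("field out of range")
    nibbles = (instance << 4) | (n_instances - 1)   # two 4-bit fields in one byte
    return ENTRY.pack(tag, file_no, nibbles, checksum, size, position)


def unpack_entry(raw):
    tag, file_no, nibbles, checksum, size, position = ENTRY.unpack(raw)
    return tag, file_no, nibbles >> 4, (nibbles & 0xF) + 1, checksum, size, position


raw = pack_entry(0xDEADBEEF, 3, 1, 16, 0xABCD, 123456, 65536)
fields = unpack_entry(raw)
```

Extending the limits mentioned above would mean widening the file number or instance fields, which is where the growth in directory size comes from.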
File access within the system according to the invention is made via a twin hashing procedure (which is distinguished from double hashing). In this procedure, the filename of each constituent data sub file 30 is hashed using two separate algorithms, one based on division hashing of the form: h1(x) = (x + k) mod P where x is a number derived from the filename (by iteratively adding together the character values shifted by their position in the string), k is a constant seed value, and P is a prime number which represents the maximum number of files which can be handled by the composite data file system. In the preferred embodiment, P is 2017, and the encrypted directory file 80 is 32 kB in size. This equates to 2048 multiplied by the 16 bytes allocated to each encrypted directory file entry 100. The directory file size of 32 kB is often advantageous as it can be naturally aligned along a power of 2 boundary in memory, which facilitates rapid access e.g. from a memory cache. However, this value can easily be increased in order to handle larger numbers of data sub files 30. Note there is a small amount of "wasted" space, equating to 31 multiplied by 16 bytes, in the preferred embodiment.
Accessing a hashed structure of this form is only efficient provided the hash table is less than 80% full, and hence the practical limit for the system where the encrypted directory file 80 has 2017 entries is 1614 files. Collisions which occur during hashing are handled using a simple double hashing algorithm: d(x) = (8 - (h1(x) mod 8)) as is widely known in the art.
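The division hash and its collision handling can be sketched as follows. The derivation of x from the filename is only loosely specified, so the `filename_to_number` helper (character values shifted by position, reduced to 32 bits) is an assumption, as is the seed value of zero; P = 2017 and the step d(x) are taken from the text.

```python
P = 2017        # prime number of addressable entries, from the text
SEED_K = 0      # constant seed k: hypothetical value


def filename_to_number(name):
    # Assumed reading of "character values shifted by their position".
    x = 0
    for position, ch in enumerate(name):
        x = (x + (ord(ch) << (position % 16))) & 0xFFFFFFFF
    return x


def h1(x, k=SEED_K):
    return (x + k) % P


def probe_step(x, k=SEED_K):
    return 8 - (h1(x, k) % 8)      # d(x), always in the range 1..8


def place(table, name):
    """Find a free slot for a sub file, stepping by d(x) on each collision."""
    x = filename_to_number(name)
    idx, step = h1(x), probe_step(x)
    for _ in range(P):
        if table[idx] is None:
            table[idx] = True      # slot claimed; the filename itself is not stored
            return idx
        idx = (idx + step) % P
    raise RuntimeError("directory is full")


table = [None] * P
first = place(table, "maps/level1.dat")
second = place(table, "maps/level1.dat")   # same name again forces a collision
```

Because P is prime, any step between 1 and 8 visits every slot before repeating, so the probe loop always finds a free entry while one exists.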
The second part of the twin hashing procedure uses a value derived from the same filename string in a similar manner to above, value x', together with a Fibonacci Hashing algorithm to calculate a hash value of the form: h2(x') = (M/W) * ((a * (x' + k')) mod W) where M is a power of two, $W = 2^n$ and n is the word size of the computer, k' is a constant seed value, and a is a constant as close as possible to the value $\varphi^{-1} W$, where $\varphi$ is the Golden Mean $= (1+\sqrt{5})/2$, therefore resulting in 2654435769 for the case where the computer word size, n, is 32, as is common in modern computer systems.
This second hash value, h2(x'), is the one which is stored in the encrypted directory entry 100 and is searched for.
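The Fibonacci hash can be sketched in integer arithmetic as below. The constant a and the word size n = 32 come from the text; the output range M, the seed K2 and the function name are illustrative assumptions, and the factor M/W is interpreted here as scaling the 32-bit product down to the range 0..M-1.

```python
N_BITS = 32          # word size n from the text
W = 1 << N_BITS      # W = 2^n
A = 2654435769       # constant a for n = 32, as given in the text
M = 1 << 16          # output range; must be a power of two (value assumed)
K2 = 0               # seed constant k'; the real value is not given (assumption)

def h2(x: int, k: int = K2, m: int = M) -> int:
    """Fibonacci hash h2(x') = (M/W) * ((a * (x' + k')) mod W), using integers only."""
    product = (A * (x + k)) % W   # multiply and wrap to the word size
    return (product * m) // W     # keep the top log2(M) bits of the product
```

With these assumed values, h2 returns the upper 16 bits of the wrapped product, which is the usual way of realising the (M/W) scaling without floating point.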
A key feature of this system is that the file name string itself is not stored in the encrypted directory file 80. This avoids a weakness that other systems have, which can be exploited by hostile parties in order to rip out content deemed unessential from the data files. That is to say, in the system of the present invention, the fact that the filenames are not stored in the encrypted directory file 80 or the composite data files 70, but can only be accessed via the hash functions, makes it almost impossible to "reverse-engineer" the composite data file and split it up into its component parts. This is a consequence of the one-way nature of the hash function.
Returning to figure 3, the data sub files 30 are concatenated together by a separate composite data file and encrypted directory file build tool 60 which generates the encrypted directory file 80, as well as the composite data files 70 themselves.
The layout of the composite data files 70 is determined by a separate layout file 40, which contains information such as whether a particular data sub file 30 should be included based on application version. For example, different files are required for different regions of the world, e.g. NTSC video files for US versions vs. PAL files for British versions. There is also provision to include multiple copies of the data sub file 30, which allows for loading optimisations to be made via minimisation of disc 90 seek times. In addition, it can be arranged that files are aligned to block boundaries on the DVD (32 kB) using padding as a further loading optimisation. Placement of sub files within the composite data file can be so ordered as to exploit the increasing speed of access for later data on a constant angular velocity (CAV) medium.
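The block alignment mentioned above amounts to a small piece of arithmetic, sketched here. The 32 kB block size is taken from the text; the function names and the idea of laying files out in a simple list order are assumptions made for illustration.

```python
BLOCK = 32 * 1024   # DVD block size quoted in the text (32 kB)

def padding_needed(offset: int, block: int = BLOCK) -> int:
    """Bytes of padding required to advance `offset` to the next block boundary."""
    return (-offset) % block

def place_files(sizes, block: int = BLOCK):
    """Return the block-aligned start offset of each sub file, in the order given."""
    offsets, pos = [], 0
    for size in sizes:
        pos += padding_needed(pos, block)  # skip forward to an aligned position
        offsets.append(pos)
        pos += size
    return offsets
```

For example, sub files of 100, 40000 and 5 bytes would start at offsets 0, 32768 and 98304 under this scheme.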
The layout file 40 serves as one input into the build tool 60. Initially the layout file 40 is parsed 61 by the composite data file and encrypted directory file build tool 60, and an unencrypted version of the directory file 62 is generated. The entries within this unencrypted directory file 62 are ordered according to the results of hashing the data sub file 30 filenames with the above mentioned first hash function h1, the division hash function.
It is at this stage that hash collisions are detected and dealt with by using the double offset hash function. The unlikely possibility of a double collision exists, such that the twin hash values calculated for two different file name strings are the same. This fault (if it occurs) is detected at this stage and rectified by perturbing the hash values by altering the seed constants, k and k', of the hash functions. The information stored in the encrypted directory file 80 includes file size information 160, position offset information 170, and a checksum 150 of the file as a whole, calculated using a standard hash function. MD5 is used in the preferred embodiment; however, any other suitable hash function, such as versions of SHA or RIPEMD, may equally be used.
The next two stages of the process are decoupled from one another, and therefore may be carried out in any order.
In the preferred embodiment, however, the two stages are carried out at the same time by the respective components of the build tool 60.
The first stage is building the composite data files 70 themselves. To do this, the original data sub files 30 and the unencrypted directory file 62 are inputted into the composite data file build component 65 of the build tool 60.
In the preferred embodiment, due to byte addressing restrictions, the maximum size of an individual composite data file 70 is 2 GB. This maximum size is determined by the maximum number of bytes addressable by a signed 32 bit value, which equates to 2^31 bytes (i.e. 2 GB). The directory information generated by the first stage of the process described earlier is used to create the composite data files 70, each of which consists of a contiguous concatenation of the data sub files 30. The data files are kept contiguous to minimise disc 90 seek times. The composite data file 70 is then padded out to the 2 GB limit using a padding file 66 containing redundant, but plausible, data. The composite data files 70 are the other output of the composite data file and encrypted directory file build process 60.
The second stage involves the encryption of the file information stored in the directory structures by means of using a block cipher. In the preferred embodiment, a simple public domain block cipher called the Tiny Encryption Algorithm (TEA) is used, which enables information to be rapidly encrypted and decrypted using a symmetric 128 bit key. This key is created by check summing 64 the application executable 50. In the preferred embodiment, the standard MD5 hash function is again used. Using this block encryption method, the critical file information stored in the encrypted directory file 80 is encrypted to produce a secure version of the directory file. This is one of the outputs of the composite data file and encrypted directory file build process 60.
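The key derivation and block encryption described above can be sketched as follows, using the published TEA round function and the MD5 digest from Python's standard library. The byte order used to split the digest into four key words, the function names, and the representation of a block as two 32-bit integers are assumptions; the text does not specify how directory entries are packed into cipher blocks.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9   # TEA key schedule constant

def key_from_executable(exe_bytes: bytes):
    """Derive four 32-bit key words from the MD5 digest (little-endian order assumed)."""
    return struct.unpack("<4I", hashlib.md5(exe_bytes).digest())

def tea_encrypt(v0: int, v1: int, key):
    """Encrypt one 64-bit block, given as two 32-bit words, with 32 TEA rounds."""
    k0, k1, k2, k3 = key
    total = 0
    for _ in range(32):
        total = (total + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + k0) ^ (v1 + total) ^ ((v1 >> 5) + k1))) & MASK
        v1 = (v1 + (((v0 << 4) + k2) ^ (v0 + total) ^ ((v0 >> 5) + k3))) & MASK
    return v0, v1

def tea_decrypt(v0: int, v1: int, key):
    """Invert tea_encrypt by running the rounds in reverse order."""
    k0, k1, k2, k3 = key
    total = (DELTA * 32) & MASK
    for _ in range(32):
        v1 = (v1 - (((v0 << 4) + k2) ^ (v0 + total) ^ ((v0 >> 5) + k3))) & MASK
        v0 = (v0 - (((v1 << 4) + k0) ^ (v1 + total) ^ ((v1 >> 5) + k1))) & MASK
        total = (total - DELTA) & MASK
    return v0, v1
```

A block encrypted with the key derived from one executable image decrypts correctly only with the key derived from the same image, which is the dependency the build process relies on.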
In the preferred embodiment, the composite data files 70, encrypted directory file 80, the application executable 50, and all the other files left out of the composite data files 70 for technical reasons, are then burnt onto optical storage media 90 as part of a standard disc authoring process.
The storage media 90 can be any of the currently known storage media. One example is DVD-R, which has a capacity of 4.3 GB; typically each disc 90 will contain 3 composite data files 70 (2 of 2 GB and 1 of c. 200 MB), the encrypted directory file 80 (32 kB), the application executable 50 (usually around 4 MB in size), and several miscellaneous files which are excluded from the composite data files 70 due to technical considerations (e.g. libraries, legal readme notices). The disc, if properly prepared, is then capable of being run in an appropriate computer system, games console, or similar.
As a side effect, the system of the present invention has the additional benefit of overcoming limitations inherent in the Sony® implementation of the ISO 9660 file system standard (e.g. used by Sony for PS2™ game discs).
This standard insists that file names must be in an 8.3 format (with an optional version number) so that ABCDEFGH.ABC is the biggest possible file name. Also there can be no more than 40 directories in total on the disc 90 with no more than 30 files and directories in each directory. As the individual components of the composite data file system of the present invention do not break these restrictions, and an individual composite data file can contain several thousand files, effectively any number of files can be included in a project (within reason).
Furthermore, the system of the present invention allows data to be distributed efficiently between fixed inaccessible regions of the disc used for proprietary copy protection watermarking, e.g. Microsoft® Xbox™, as the maximum size of a contiguous data block is restricted.
Figure 5 shows how an application executable 50 that makes use of the present invention runs 310 on a target computer system 10.
When the disc 90 created as the final outcome of the disc build process 200 is run, the application executable 50 initially checksums itself 320 at an early stage of execution. In the present embodiment, the checksum is carried out using the standard MD5 hashing algorithm, but again, any suitable hash function may equally be used. This allows the checksum used to encrypt the file information to be recovered, so that this information can be retrieved from the encrypted directory file 80, which is loaded and held resident in memory, and unlocked when a file access request is encountered during operation 330 of the application executable 50. Hence once the directory file 80 is encrypted, it is effectively locked to the application executable 50 with the matching checksum.
During the normal operation of the application executable 50, numerous data sub file 30 access calls 400 will be made, and a number of data integrity checks 500 will also be made.
Figure 6 shows the typical process involved in accessing files within one or more of the composite data files 70, using the encrypted directory file 80.
When a file request 405 is made, the filename is hashed using the first of the twin hash functions, h1, 410 and the resultant value is searched for 415 in the encrypted directory file 80. The entry in this position is then read, and the hash value of the filename using the second of the hash functions, h2, is then compared 425 to the stored value. If the value does not match, the double hashing procedure 435 is invoked, and the encrypted directory file is searched repeatedly until the entry with a matching second twin hash function value is found 435. If the search loops back to the initial value, i.e. the first hash function value h1, the search has failed and a file system error has occurred 445.
When a valid directory entry 100 is found, the encrypted information about the size 160, position 170 in the form of an offset, and checksum 150 of the file is decrypted 440 and is then used to access the file in the relevant composite data file 70. Loading of the file then takes place as normal using a standard method 445.
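The lookup procedure of figure 6 can be sketched as below. The table is modelled as a Python list whose slots hold either None or a dictionary with an "h2" field; this representation, the field names and the function name are hypothetical. The caller is assumed to have already computed the starting slot h1, the step d and the wanted h2 value for the requested filename.

```python
from typing import Optional

def lookup(table: list, start: int, step: int, wanted_h2: int) -> Optional[dict]:
    """Search from slot `start`, advancing by `step`, for an entry whose stored h2 matches."""
    size = len(table)
    slot = start
    while True:
        entry = table[slot]
        if entry is not None and entry["h2"] == wanted_h2:
            return entry           # match: the caller decrypts size, offset and checksum
        slot = (slot + step) % size
        if slot == start:
            return None            # looped back to the h1 slot: file system error
```

Returning None corresponds to the file system error case; a returned entry would then have its encrypted portion decrypted before the file is loaded.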
In the case of multiple files, the file offset 170 of the instance closest to the current read position is calculated to minimise disc seek time. This allows loading optimisations to be made via the ordering of files in the layout file as disc seek times contribute significantly to overall load times.
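Choosing between several stored copies of a sub file can be sketched in one function. Using the absolute byte distance from the current read position as a stand-in for seek time is an assumption, as is the function name.

```python
def nearest_instance(offsets, read_position: int) -> int:
    """Return the stored offset closest to the current read position."""
    return min(offsets, key=lambda off: abs(off - read_position))
```

For instance, with copies at offsets 0, 1,000,000 and 5,000,000 and a read position of 4,200,000, the copy at 5,000,000 would be selected.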
If the file being loaded matches preset criteria, which in the preferred embodiment is if it is less than 64 kB in size, then an additional integrity check process is undertaken. In this process, when the file is loaded 445 the checksum of the file is calculated 460 and is compared to the value stored in the directory information 470. If the checksums do not match a marker is set 480 (for example, a global Boolean in the case of the preferred embodiment) which can then be acted on by an integrity check 500 at a later date.
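The load-time check can be sketched as follows. The 64 kB threshold and the use of MD5 come from the preferred embodiment; the small marker class, which stands in for the global Boolean, and the function name are illustrative assumptions.

```python
import hashlib

SIZE_LIMIT = 64 * 1024   # threshold from the preferred embodiment (64 kB)

class IntegrityMarker:
    """Stands in for the global Boolean described in the text (hypothetical)."""
    def __init__(self) -> None:
        self.failed = False

def check_loaded_file(data: bytes, stored_checksum: bytes, marker: IntegrityMarker) -> None:
    """Set the marker if a sufficiently small file's MD5 digest differs from the stored one."""
    if len(data) < SIZE_LIMIT and hashlib.md5(data).digest() != stored_checksum:
        marker.failed = True   # acted on later by a separate integrity check
```

Note that the function only records the mismatch; it deliberately takes no action itself, so that the point of failure is separated from the point of detection.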
Although in the preferred embodiment the contents of each of the composite data files 70 are not encrypted, in order to minimise loading times, where such restrictions do not apply it is possible to implement bulk encryption of the contents of the composite data files 70. This gives the added benefit of making the composite data files 70 both more secure and incompressible, which is an advantage as it increases the bandwidth needed to download a copied file across a network. In this case, there is provision for streamed decryption of files, again using, for example, the Tiny Encryption Algorithm block cipher.
Figure 7 shows the outline of the integrity check process that can be invoked at various points in the execution of the application executable 50, provided preset criteria are met. As previously described, in the preferred embodiment, this process is invoked when the data sub files being accessed are less than 64 kB.
The checks on the integrity of data sub files 30 are performed by means of a comparison 470 of the checksum of the loaded file contents 460 with the stored file checksum when they are loaded.
Throughout normal operation of the application executable 50, a number of data integrity checks 500 are performed. In the case of the preferred embodiment, this is done by checking the value of the global Boolean set during the file load process. If this value is found to have changed to the state which signifies that a data sub file 30 checksum 150 mismatch has been found, action can then be taken to make the application execution 50 fail, either catastrophically or more gradually (in order to complicate attempts to discover the location of the integrity checks within the program 50).
The fact that the encrypted directory 80 is encrypted using a symmetric key that depends on the checksum of the application executable 50 makes it harder to remove the integrity checks as removing them will change the checksum of the application executable 50, and make the encrypted directory file 80 unreadable, thereby making the loading of the data sub files 30 virtually impossible.
Although the above described system has many diverse benefits, problems may occur if the application executable needs to be changed (or "patched"), as this will inevitably change the executable checksum used to decrypt the file directory. However, provided that no data has changed, this problem can be circumvented by regenerating the encrypted directory file 80 using the new application executable 50 in the build process 60.
If data sub file 30 changes are minor, that is, they do not change the file names, sizes or positions of the data sub files 30 within the composite data file(s) 70, then only the data file checksums will need to be recalculated before the updated encrypted directory file is regenerated. Altered data can then be patched over the composite data files 70 without disturbing the positions and sizes of unaltered data.
If extensive changes to data are made, the composite data file 70 will need to be completely rebuilt in addition to the encrypted directory file 80. The impact of this can be minimised by placing data likely to be altered in a separate composite data file 70, allowing unaltered data to remain undisturbed in other composite data files 70. Careful thought thus needs to be given to the structure of the layout file used to build the composite data files 70, in order to minimise risks, but the exact details of this will be highly application specific.
Whilst a specific embodiment of the invention has been described, it is to be understood that this is by way of example only and that various modifications may be considered. For example, the data sub files 30 need not be accessed by a filename, but could equally be accessed by another label, such as a number. Equally, although in the described embodiments the composite data files 70 have been unencrypted, where appropriate, the composite data files 70 may be encrypted. Therefore, the specific embodiment is not to be seen as limiting of the scope of protection, which is instead to be determined by the following claims.
Claims (45)
- CLAIMS: 1. A method of creating a directory structure comprising a plurality of substructures, for use in accessing selected sub files from within a composite data file, each sub file having an associated filename, the method comprising: associating an unique substructure with each sub file by applying a first predetermined function to the associated filename and using the result as an index into the plurality of substructures; and applying a second predetermined function to the associated filename of each sub file and storing the result in a first portion of the substructure associated with the sub file.
- 2. The method of claim 1, wherein the directory structure is used by a predetermined executable to access selected sub files from within a composite data file, and the method further comprises the step of encrypting a second portion of each substructure using an encryption key derived from the predetermined executable using a third predetermined function.
- 3. The method of claim 1 or 2, wherein the first predetermined function includes a first predetermined hash function.
- 4. The method of claim 3, wherein the first predetermined function further includes a first double offset function for use when said first hash function does not produce an unique index for the associated sub file.
- 5. The method of claim 3, wherein the first predetermined hash function is a division hashing function.
- 6. The method of claim 5, wherein the first predetermined hash function is h1 (x)=(x+k) mod p, in which x is a value derived from the sub file filename, k is a seed constant, and p is an integer value.
- 7. The method of claim 4, wherein the first double offset function is d (x) = (8 - (h1 (x) mod 8)).
- 8. The method of claim 1, wherein the second predetermined function is a second predetermined hash function.
- 9. The method of claim 8, wherein the second predetermined hash function is a Fibonacci hashing function.
- 10. The method of claim 9, wherein the second predetermined hash function is h2(x') = (M/W) * ((a * (x' + k')) mod W), in which M is a power of two, W is 2^n, where n is the word size of the computer, a is 2654435769, x' is a value derived from the sub file filename, and k' is a second constant seed value.
- 11. The method of claim 2, wherein the third predetermined function is a third predetermined hash function.
- 12. The method of claim 11, wherein the third predetermined hash function is the MD5 checksum hashing function.
- 13. The method of claim 1, further including the step of concatenating together the sub files into at least one composite data file in accordance with the directory structure.
- 14. The method of claim 1, wherein the step of encrypting a second portion of each substructure comprises using a block encryption algorithm to encrypt the second portion of each substructure.
- 15. The method of claim 1, further including the step of applying a fourth predetermined function to the sub file and storing the result in the second portion of the substructure associated with the sub file.
- 16. The method of claim 1, wherein each sub file has a file size, and the method further includes the step of determining the size of each sub file and storing the result in the second portion of the substructure associated with the sub file.
- 17. The method of claim 13, wherein each sub file has a position offset from the start of the composite data file and the method further comprises the step of storing the position offset of each sub file within a composite data file in the second portion of the substructure associated with the sub file.
- 18. The method of claim 1, wherein the sub file may be in any one of a plurality of composite data files, the method further comprising the step of storing, in the first portion of each substructure, the number of the composite data file containing the associated sub file.
- 19. The method of claim 1, wherein the sub file may be included in at least one composite data file more than once, and the method further includes the step of storing the instance number of the associated sub file.
- 20. The method of claim 1, wherein the first portion of each indexed substructure further includes the number of instances of the associated sub file.
- 21. The method of claim 13, further including the step of adding padding data to the concatenated together sub files to create composite data files of a certain size.
- 22. The method of claim 1, wherein the use of the first predetermined function enables the elimination of characteristic filenames from the directory structure.
- 23. A directory structure for a computer, for use by a predetermined executable to access a selected sub file from within a composite data file, each sub file having an associated filename, comprising: a plurality of substructures, each substructure being associated with a particular sub file by a result of applying a first predetermined function to the filename associated with the sub file, each substructure including a first portion containing the result of applying a second predetermined function to the filename of the associated sub file, and a second portion encrypted using a key derived from applying a third predetermined function to the predetermined executable.
- 24. A computer readable medium, comprising: at least one composite data file; and a directory structure constructed according to any of claims 1 to 22.
- 25. A computer readable medium, comprising: at least one composite data file; and the directory structure of claim 23.
- 26. A method of enabling a predetermined executable to access a sub file located within a composite data file, which includes a plurality of sub files, each sub file having an associated filename, the method comprising: locating a substructure uniquely associated with a sub file by applying a first predetermined function to the filename associated with the sub file and using the result as an index into the plurality of substructures; applying a second predetermined function to the filename associated with the sub file being accessed, and comparing the resultant value with a value stored in a first portion of the substructure indexed by the result of first step; decrypting a second portion of the said substructure using a key derived from applying a third predetermined function to the predetermined executable, when the two values compared in the second step are equal; and loading the sub file using information contained within the decrypted second portion of the substructure associated with the sub file being accessed.
- 27. The method of claim 26, further comprising the step of applying a predetermined offset function to the filename of the sub file being accessed when the two values compared in the second step are not equal.
- 28. The method of claim 26, further comprising the step of comparing a file size of the sub file stored in the second portion of the associated substructure with a predetermined file size, and checking the integrity of the sub file if the file size of the sub file is found to be below the predetermined file size.
- 29. The method of claim 28, further comprising applying a fourth predetermined function to the sub file loaded to produce a checksum of the sub file, comparing the checksum of the sub file produced with a checksum of the sub file stored within the second portion of the substructure associated with the sub file, and setting a marker dependent on the outcome of the comparison.
- 30. The method of claim 15 or 29, wherein the fourth predetermined function is a hash function.
- 31. A composite data file for access by an executable program, comprising a plurality of sub files, wherein the sub files are located and loaded into memory using information held within a directory file constituted according to any of claims 1 to 22.
- 32. Apparatus for building a directory structure comprising a plurality of substructures, for use by a predetermined executable to access selected sub files from within a composite data file, each sub file having an associated filename, the apparatus comprising: means for applying a first predetermined function to a filename associated with a sub file and using the result as an index into the directory substructures; means for applying a second predetermined function to a filename associated with a sub file and storing the result in a first portion of each associated substructure; and means for applying a third predetermined function to the predetermined executable and encrypting a second portion of each substructure dependent on the result of applying the third predetermined function to the predetermined executable.
- 33. The apparatus of claim 32, further comprising: means to concatenate sub files together into a composite data file in accordance with information stored within the directory structure.
- 34. The apparatus of claim 33, further comprising means for applying padding data to the sub file data.
- 35. A method of testing the integrity of sub files loaded by a predetermined executable, the method comprising: loading a sub file; calculating a checksum of the sub file; comparing the calculated checksum with a checksum previously stored for said sub file; setting a marker if said calculated checksum does not equal the stored checksum for the sub file.
- 36. A method of linking a predetermined executable program to a plurality of sub files to be accessible by the executable during execution by a computer, the method comprising: concatenating the plurality of sub files into a composite data file; creating a directory structure for accessing the individual sub files within the composite data file, wherein the directory structure includes a plurality of substructures, each substructure being uniquely associated with a particular sub file; storing information necessary to locate a sub file within the composite data file in the substructure associated with each sub file; deriving an encryption key from the predetermined executable; and encrypting at least a portion of the directory substructure containing said information using the encryption key, to thereby restrict access to the sub file.
- 37. A method of creating a data structure comprising a plurality of substructures, each substructure having an associated reference, the method comprising: associating an unique substructure with each reference by applying a first predetermined function to the reference and using the result as an index into the plurality of substructures; and applying a second predetermined function to each reference and storing the result in the substructure associated with the reference.
- 38. The method of claim 37, wherein the data structure is used to access data sub files, each data sub file associated with an unique substructure.
- 39. The method of claim 37, wherein the first predetermined function comprises a first predetermined hash function.
- 40. The method of claim 39, wherein the first predetermined function further includes a first double offset function for use when said first hash function does not produce an unique index for the associated data file.
- 41. The method of claim 39, wherein the first predetermined hash function comprises a division hashing function.
- 42. The method of claim 37, wherein the second predetermined function comprises a second predetermined hash function.
- 43. The method of claim 42, wherein the second predetermined hash function comprises a Fibonacci hashing function.
- 44. A method of restricting access to data to a predetermined executable or holder of a said executable, wherein the data is required by the predetermined executable during execution, by producing circular closure giving rise to mutual dependency between directory information required to access the data and the predetermined executable, the method comprising encrypting the directory information used to locate the data using a key derived from the checksum of the predetermined executable.
- 45. The method of claim 44, further comprising the step of carrying out data integrity checks on the data required by the predetermined executable when said data is loaded by said predetermined executable.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB0421636A GB2418748B (en) | 2004-09-29 | 2004-09-29 | Directory structures for composite data files |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| GB0421636D0 (en) | 2004-10-27 |
| GB2418748A (en) | 2006-04-05 |
| GB2418748B (en) | 2010-06-09 |
Family
ID=33397453
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| GB0421636A Expired - Fee Related GB2418748B (en) | 2004-09-29 | 2004-09-29 | Directory structures for composite data files |
Country Status (1)
| Country | Link |
|---|---|
| GB (1) | GB2418748B (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020003886A1 (en) * | 2000-04-28 | 2002-01-10 | Hillegass James C. | Method and system for storing multiple media tracks in a single, multiply encrypted computer file |
| US20030070071A1 (en) * | 2001-10-05 | 2003-04-10 | Erik Riedel | Secure file access control via directory encryption |
| GB2382179A (en) * | 1999-09-07 | 2003-05-21 | Emc Corp | System and method for secure storage, transfer and retrieval of content addressable information |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2516741C (en) * | 2003-02-21 | 2013-10-08 | Caringo, Inc. | Additional hash functions in content-based addressing |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2254118A1 (en) * | 2009-05-20 | 2010-11-24 | Sony DADC Austria AG | Method for copy protection |
| US8717857B2 (en) | 2009-05-20 | 2014-05-06 | Sony Dadc Austria Ag | Method for copy protection |
| US9013970B2 (en) | 2009-05-20 | 2015-04-21 | Sony Dadc Austria Ag | Method for copy protection |
| US9263085B2 (en) | 2009-05-20 | 2016-02-16 | Sony Dadc Austria Ag | Method for copy protection |
Also Published As
| Publication number | Publication date |
|---|---|
| GB2418748B (en) | 2010-06-09 |
| GB0421636D0 (en) | 2004-10-27 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US7281273B2 (en) | Protecting content on medium from unfettered distribution | |
| US6920565B2 (en) | Method and system for providing secure digital music duplication | |
| JP5034227B2 (en) | Information processing apparatus, information recording medium manufacturing apparatus, information recording medium and method, and computer program | |
| CN101057288B (en) | Method and apparatus for binding content to removable storage | |
| US20050010767A1 (en) | System and method for authenticating software using hidden intermediate keys | |
| CN1575446A (en) | Method for binding a software data domain to specific hardware | |
| NO330422B1 (en) | Encryption for digital rights management, as well as data protection of content on a device without interactive authentication | |
| KR101036701B1 (en) | A system that associates secrets with computer systems that allow for hardware changes. | |
| CN103077333B (en) | A kind of software code protection method under Linux system | |
| WO2007030931A1 (en) | System and method for preventing unauthorized use of digital works | |
| US7685646B1 (en) | System and method for distributing protected audio content on optical media | |
| US20100017624A1 (en) | From polymorphic executable to polymorphic operating system | |
| US20020146121A1 (en) | Method and system for protecting data | |
| US20090285070A1 (en) | Copy-protected optical storage media and method for producing the same | |
| US20040010691A1 (en) | Method for authenticating digital content in frames having a minimum of one bit per frame reserved for such use | |
| KR100573740B1 (en) | Software piracy and illegal usage prevention method and system | |
| US8490208B2 (en) | Method and device for detecting if a computer file has been copied and method and device for enabling such detection | |
| GB2418748A (en) | Directory structures for composite data files | |
| US7672454B2 (en) | Method for copy protection of digital content | |
| JP4941611B2 (en) | Information processing apparatus and method, and computer program | |
| Hyams | Copy Protection of Computer Games |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 732E | Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977) | Free format text: REGISTERED BETWEEN 20120906 AND 20120912 |
| | PCNP | Patent ceased through non-payment of renewal fee | Effective date: 20190929 |