HK1056036B - Music data compression method and apparatus
- Publication number: HK1056036B (application HK03108304.1A)
- Authority: HK (Hong Kong)
- Prior art keywords: performance event, data, information, music data, note
- Prior art date: 2002-03-20
Description
Technical Field
The present invention relates to a music data compression method for compressing performance event information made up of information such as tone pitch information, utterance timing, utterance duration, and a channel number corresponding to a part, and more particularly to a music data compression method suitable for distributing music data such as incoming call tunes for cellular phones.
Background
In recent years, the use of networks by electronic devices (terminal apparatuses) such as cellular phones and personal computers has developed rapidly, and services in which a variety of data contents are received from servers via such terminal apparatuses have become available. Examples of such data contents include music data sounded as an incoming call tune on a cellular phone and music data of a music piece or karaoke music played on a personal computer.
However, as the reproduction time and/or the number of parts of such a music piece increases, the data size also increases, which leads to an increase in the communication time and cost required for downloading music data of an incoming call tune or the like. Furthermore, the terminal apparatus requires a large storage capacity for storing the music data. To overcome these problems, the music data needs to be compressed.
Japanese patent laid-open (Kokai) No. 8-22281 discloses a method of compressing a MIDI signal by analyzing the MIDI signal as music data to detect a continuously repeated tone or pattern, deleting the part of the music data corresponding to the detected repeated tone or pattern, and inserting, in place of the deleted part, a signal indicating that the tone or pattern is to be continuously repeated. Another method is disclosed in Japanese patent laid-open (Kokai) No. 9-153819, which uses a data rearrangement process to decompose the MIDI data of each tone (consisting of 5 data elements: tone pitch, duration, tone length, velocity, and channel number) into its 5 data elements, and then regroups the data by element type into sets of like data elements, thereby increasing the compression ratio of a reversible (lossless) compressor in the following stage.
However, according to the method proposed in Japanese patent laid-open (Kokai) No. 8-22281, when a key-on event is considered, for example, the MIDI data is composed of status information (information representing the key-on event and information representing the channel), key number information (7 bits), velocity information (7 bits), and gate time information (duration information in some cases), and the same tone rarely occurs continuously in all of these kinds of information, resulting in low compression efficiency. Also, although music data containing a predetermined pattern or channel could conceivably be compressed at a high compression ratio, this requires a complicated algorithm to detect long repeated segments.
On the other hand, Japanese patent laid-open (Kokai) No. 9-153819 proposes a technique for compressing karaoke contents in a communication karaoke system. According to this technique, once the karaoke content subjected to the data rearrangement process has been downloaded to a terminal device installed in a karaoke room or home, the respective groups of 5 data elements are rearranged back into the original MIDI data for each tone to be used as karaoke data. Thus, this technique is not suitable for streaming reproduction, in which music data is reproduced while being received from a server via a network. Also, Japanese patent laid-open (Kokai) No. 9-153819 discloses only the rearrangement of data elements and does not propose any new compression method.
Disclosure of Invention
An object of the present invention is to provide a new music data compression method capable of reducing the size of music data, and a program for executing the method.
In order to achieve the above object, in a first aspect of the present invention, there is provided a music data compression method comprising the steps of: receiving music data containing a sequence of pieces of performance event information, each formed of note information; and converting each piece of performance event information of the music data into another form of performance event information, which contains state information corresponding to a matching or mismatching pattern in note information between the piece of performance event information and the previous piece of performance event information, and the note information required according to the matching or mismatching pattern corresponding to the state information.
According to the method of the first aspect of the present invention, each piece of performance event information of the music data is converted into another form of performance event information that contains state information and note information, the state information corresponding to a matching or mismatching pattern in note information between the piece of performance event information and the previous piece of performance event information, and the note information being only that required according to the matching or mismatching pattern corresponding to the state information. Since the converted performance event information carries only the note information that does not match the previous piece of performance event information, the data size of the music data can be reduced. In decompressing the compressed music data, each piece of compressed performance event information can be restored to its original, pre-compression form by referring to the note information of the previous piece of performance event information according to the matching or mismatching pattern indicated by its state information.
Preferably, the note information includes tone pitch information, and the converting step includes the step of expressing the tone pitch information included in the note information by a difference from a predetermined initial tone pitch.
According to this preferred form, the compressed tone pitch information (the difference) is a value relative to the initial tone pitch, so the data size of the music data can be made smaller than when the tone pitch is expressed as an absolute value over the entire note range.
It is preferable that the initial tone pitch be a tone pitch of any piece of performance event information contained in the music data.
Since the initial tone pitch can thus be set to the tone pitch of the first tone of the music data, the tone pitch of a prescribed tone of the music data, or a tone pitch intermediate between the highest and lowest tone pitches of the music data, the data size of the music data can be made smaller.
Preferably, the music data received in the receiving step includes a sequence of pieces of performance event information of a plurality of channels, and the converting step includes: arranging the pieces of performance event information for all channels in time-series order; and detecting a match or mismatch in note information between each piece of performance event information arranged for all channels in time-series order and the previous piece of performance event information.
With this preferred method according to the first aspect, since the music data is compressed according to the detection of a match or mismatch in note information between each piece of performance event information arranged for all channels in time-series order and the previous piece, the compressed music data contains the pieces of performance event information of all channels arranged in time-series order. As a result, when the compressed music data is distributed, for example, music can be reproduced from the music data while the music data is being received (i.e., streaming reproduction can be performed).
Alternatively, in the method according to the first aspect, the music data received in the receiving step includes a sequence of pieces of performance event information for a plurality of channels, and the converting step includes: arranging a plurality of pieces of performance event information in a time-series order on a channel-to-channel basis; and detecting a match or mismatch in note information between each piece of performance event information and the previous piece of performance event information arranged in time series order on a channel-to-channel basis.
Since the music data is thus compressed according to the detection of a match or mismatch in note information between each piece of performance event information arranged in time-series order on a channel-by-channel basis and the previous piece, each piece of performance event information contained in the music data need not contain channel information, which can reduce the data size of the music data.
It is preferable that an initial tone pitch be set for each channel, and that the tone pitch information of each channel be expressed by a difference from the initial tone pitch of that channel.
According to this preferred form, even when the music data contains tone pitch information that could not be expressed within the predetermined data length as a difference from a single initial tone pitch common to all channels, setting an initial tone pitch for each channel increases the possibility that such tone pitch information can be expressed as a difference from the initial tone pitch of the corresponding channel, which can further reduce the data size of the music data.
It is preferable that the initial tone pitch be a single tone pitch, and that the tone pitch information of each piece of performance event information for all channels arranged in time-series order be expressed by a difference from the single initial tone pitch.
Since the initial tone pitch is a single tone pitch, the compression process can be simplified.
It is preferable that the note information of each piece of performance event information include interval information representing an interval from the previous piece of performance event information and utterance duration information of the performance event associated with the piece of performance event information, and that the converting step include expressing the interval information and the utterance duration information by predetermined note lengths.
Since the interval information and the utterance duration information are each expressed by, or approximated to, a predetermined note length, the data size of the music data can be reduced.
In order to achieve the above object, in a second aspect of the present invention, there is provided a program for causing a computer to execute a music data compression method, the program comprising: a receiving module for receiving music data containing a sequence of pieces of performance event information, each formed of note information; and a conversion module for converting each piece of performance event information of the music data into another form of performance event information, which contains state information corresponding to a matching or mismatching pattern in note information between the piece of performance event information and the previous piece of performance event information, and note information required according to the matching or mismatching pattern corresponding to the state information.
The above and other objects, features and advantages of the present invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings.
Drawings
Figs. 1A and 1B are diagrams showing examples of music data before and after compression of a performance event data sequence by a music data compression method according to an embodiment of the present invention;
Figs. 2A and 2B are diagrams showing examples of music data formats during compression by the music data compression method according to the present embodiment;
Figs. 3A and 3B are diagrams showing two types of music data formats to be compressed by the music data compression method according to the present embodiment;
Fig. 4 is a diagram showing the basic configuration of a system to which the music data compression method according to the present embodiment is applied;
Fig. 5 is a diagram showing a specific example of the construction of the system of Fig. 4;
Fig. 6 is a block diagram showing the hardware configuration of a computer provided in the distribution server in Fig. 5;
Fig. 7 is a block diagram showing the hardware configuration of the cellular phone in Fig. 5;
Fig. 8 is a flowchart of the main part of the music data compression process performed by the distribution server in Fig. 5;
Fig. 9 is a flowchart of a mono block compression process subroutine executed by the music data compression method according to the present embodiment;
Fig. 10 is a flowchart of a multi-channel block compression process subroutine executed by the music data compression method according to the present embodiment; and
Fig. 11 is a flowchart of the main part of a music data distribution process performed by the music data compression method according to the present embodiment.
Detailed Description
The present invention will now be described in detail with reference to the accompanying drawings of preferred embodiments thereof.
Referring first to Fig. 4, there is shown the basic configuration of a system to which the music data compression method according to the embodiment of the present invention is applied. As shown in Fig. 4, the data generating and distributing side, such as a distribution server (computer), compresses music data in the Standard MIDI File (SMF) format or in the format used by the present embodiment (SMAF format) into compressed music data in a format specific to the present embodiment.
The data receiving and reproducing side, such as a cellular phone, downloads the music data compressed at the data generating and distributing side and stores it in its memory. During reproduction of the music data, the cellular phone or other data receiving and reproducing side decompresses the music data stored in the memory and performs sequencer processing to transfer key-on data, note number data, key data, and the like to a predetermined channel of a tone generator, which sounds the music data. In the following description, the data generating and distributing side is referred to as the "distribution server", the data receiving and reproducing side as the "cellular phone", and the music data as "incoming call tune data".
Fig. 5 illustrates a specific example of the construction of the system of Fig. 4. In Fig. 5, a distribution server 1 for distributing incoming call tune data, a station 3 wirelessly connected to a user's cellular phone 2, and a network-compatible personal computer 4 are connected to a network 10, which is a telephone switching network.
Fig. 6 is a block diagram showing the hardware configuration of a computer provided in the distribution server 1 in Fig. 5. As shown in Fig. 6, the CPU1a runs on the OS installed in the external storage 1b, and executes various processes using the ROM1c and the RAM1d. The external storage 1b is realized by, for example, a large-capacity HDD (hard disk drive), and stores various data such as music data in the SMF format and the SMAF format and various information (sources of WWW pages written in HTML). The communication interface 1e is connected to the network 10, and the distribution server 1 provides a service for distributing music data of incoming call tunes to the cellular phone 2 or the like of a user. In distributing the music data, the CPU1a executes compression processing, communication processing, and the like for compressing the music data, as described in detail below. The distribution server 1 further includes a display 1f, a keyboard 1g, and a mouse 1h as input/output devices for the operator.
Fig. 7 is a block diagram showing the hardware configuration of the cellular phone 2 in Fig. 5. As shown in Fig. 7, the CPU2a controls the overall operation of the cellular phone 2 by executing programs stored in the ROM2b. The RAM2c is used as a work area for the CPU2a, a storage area for music data downloaded via the distribution service, and a storage area for configuration data set by the user. The communication section 2d demodulates the signal received by the antenna 2e, and modulates the signal to be transmitted and supplies the modulated signal to the antenna 2e. The voice processing section 2f decodes the received voice signal demodulated by the communication section 2d, and the music reproducing section 2g outputs the decoded voice signal as sound via the receiver speaker 2i. The music reproducing section 2g also reproduces music from the music data of an incoming call tune and outputs the incoming call tune via the incoming call speaker 2j. A voice signal input through the microphone 2h is compressed and encoded by the voice processing section 2f, and the encoded voice signal is transmitted by the communication section 2d.
The interface (I/F) 2k is used to download music data, timbre data, and the like from an external device such as a personal computer. The cellular phone 2 also includes: an input section 2m composed of dial buttons and other operating elements; a display section 2n that performs predetermined display in response to operations such as selecting the distribution service and pressing the dial buttons; and a vibrator 2p for vibrating the main body of the cellular phone, instead of sounding an incoming call tone, when a call is received.
The music reproducing section 2g reproduces a music piece by reading performance event data from a music data buffer provided in the section 2g. When a free (usable) area of a predetermined size appears in the music data buffer during reproduction of a music piece, an interrupt request signal is sent to the CPU2a. In response to the request signal, the CPU2a reads, from the compressed music data stored in the RAM2c, the music data not yet in the music data buffer, and transfers the data read from the RAM2c to the music reproducing section 2g. The music data is decompressed by the CPU2a before being supplied to the music reproducing section 2g. The timing of decompression depends on the format, as described below. It should be noted that the music reproducing section 2g includes a tone generator which generates music signals for a plurality of channels by time-division multiplexing, and reproduces the incoming call tune in accordance with the performance events in the music data in the same manner as automatic performance music reproduction. Techniques for reproducing automatic performance music from music data are known and thus will not be described in detail.
Figs. 3A and 3B show the two music data formats to be compressed according to the present embodiment. Figs. 3A and 3B each show a single file for the same single-music-piece music data. Generally, music data as shown in the figures is composed of data for a plurality of channels (16 channels in the case of SMF) corresponding to respective parts. In the present embodiment, it is assumed that the music data is formed of data for 4 channels, so that music with at most 4 timbres can be produced. Also, in the present embodiment, music data is formed in one of two formats, i.e., a mono block format and a multi-channel block format, which differ from each other in channel management.
In the mono block format shown in Fig. 3A, "data length", "tempo", and "timbre" data are recorded as a header, and "performance event" data (performance event information) corresponding to the tones of the music piece are recorded after the header. The "data length" is 8-bit data representing the entire data length, the "tempo" is 4-bit data representing the music reproduction tempo, and each "timbre" is 6-bit data representing the timbre assigned to one of the channels ch1 to ch4.
Each "performance event" data is composed of data (note information) of "channel", "note number", "duration", and "gate time". The "channel" data is 2-bit data representing the number of channels of a channel to which a performance event belongs, the "note number" data is 6-bit data representing pitch of a tone, the "duration" data is data representing a time interval of 2 bytes from a previous event (i.e., note length), and the "gate time" data is 1 to 2-byte data representing a sounding duration.
In the multi-channel block format shown in Fig. 3B, "data length" and "tempo" data are recorded as a header, and channel-specific groups (blocks) of data for the 4 channels are recorded after the header on a channel-by-channel basis. The number of channel-specific blocks may be less than 4. Each block of data contains "channel" data representing the channel number of the block, "data length" data representing the entire data length of the block, and "timbre" data representing the timbre assigned to the block (channel), which are recorded as the block header, with "performance event" data corresponding to the respective tones of the music piece recorded after the block header.
Each "performance event" data is composed of "note number", "duration", and "strobe time" data, similar to the mono block format of fig. 3A. However, in the case of the multi-channel block format, the number of channels is set on a block-by-block basis, and thus each "performance event" data does not contain "channel" data. Each duration data represents a time interval from a previous event, and therefore, the first tone (performance event) duration corresponding to the entire piece of music takes a value of 0. On the other hand, the first tone duration data of the other channels than the first tone of the entire piece of music takes a value other than 0.
It should be noted that each of the "note number" data is composed of 2-bit "block" data representing octaves and 4-bit "note" data representing a pitch name in a mono block format or a multi-channel block format as shown in table 1 below.
TABLE 1
| Block (2 bits) | Block name | Note (4 bits) | Pitch name |
| 00b | Block 1 | 0000b | C |
| 01b | Block 2 | 0001b | C# |
| 10b | Block 3 | 0010b | D |
| 11b | Block 4 | 0011b | D# |
| 0100b | E | ||
| 0101b | F | ||
| 0110b | F# | ||
| 0111b | G | ||
| 1000b | G# | ||
| 1001b | A | ||
| 1010b | A# | ||
| 1011b | B | ||
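For illustration, the 6-bit note number of Table 1 can be packed and unpacked as in the following Python sketch; the placement of the block in the high bits and the note in the low bits, as well as all function names, are assumptions made for this sketch rather than details taken from the specification.

```python
# Hypothetical layout of the 6-bit "note number" of Table 1:
# a 2-bit block (octave) assumed to occupy the high bits, and a 4-bit note
# (pitch name) assumed to occupy the low bits.

PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pack_note_number(block: int, note: int) -> int:
    """Combine a 2-bit block (0-3) and a 4-bit pitch-name index (0-11)."""
    assert 0 <= block <= 3 and 0 <= note <= 11
    return (block << 4) | note

def unpack_note_number(note_number: int) -> tuple:
    """Split a 6-bit note number back into its block and pitch name."""
    block = (note_number >> 4) & 0b11
    note = note_number & 0b1111
    return block, PITCH_NAMES[note]

# Example: block 2 ("01b") and D# ("0011b") pack to 010011b.
print(f"{pack_note_number(1, 3):06b}")   # 010011
print(unpack_note_number(0b010011))      # (1, 'D#')
```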
Music data having the above-described format structure is processed by the distribution server 1 in the following manner. The distribution server 1 stores music data of a plurality of music pieces as source data, recorded in the mono block format and in the multi-channel block format. These music data are compressed into compressed data in the mono block format and compressed data in the multi-channel block format. In the present embodiment, before the music data is compressed, the note number of each performance event is converted into data representing the difference from an initial note number.
Hereinafter, the first music data compression method according to the present embodiment is described.
First, the note number of the first performance event of each channel is detected and set as the initial note number (initial tone pitch) of that channel. Then, the note numbers of the second and subsequent performance events of the same channel are each converted into a difference form representing the difference (pitch difference) from the initial note number. This conversion of note numbers into difference form is performed for a channel in which the difference between the note number of every performance event and the initial note number can be represented by 5 bits (hereinafter also referred to as a "channel within the allowed note range"), but is not performed for a channel in which the difference between the note number of at least one performance event and the initial note number cannot be represented by 5 bits. In the latter case, in order to identify channels whose note numbers have not been converted into difference form, it suffices to record data of a predetermined special kind (one that cannot be a note number) in place of the initial note number of the channel. Then, the "duration" and "gate time" data of each performance event are each adjusted to the predetermined note length in Table 2 that is closest to them, and converted into the 3-bit data corresponding to that predetermined note length.
TABLE 2
| Duration/gate time (3 bits) | Note length |
| 000b | Whole note |
| 001b | Half note |
| 010b | Quarter note |
| 011b | Eighth note |
| 100b | Eighth-note triplet |
| 101b | Sixteenth note |
| 110b | Sixteenth-note triplet |
| 111b | Thirty-second note |
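The two per-event conversions just described might be sketched in Python as follows. The signed 5-bit range (-16 to +15) and the tick values (quarter note = 480 ticks) are assumptions introduced for this sketch; the specification only requires that the difference fit in 5 bits and that durations and gate times be adjusted to the nearest note length in Table 2.

```python
# Sketch of the per-event conversions of the first compression method.
# Assumptions: the 5-bit note message is a signed difference (-16..+15), and
# durations/gate times are given in ticks with a quarter note equal to 480 ticks.

NOTE_LENGTH_TICKS = {       # 3-bit code -> assumed length in ticks (Table 2)
    0b000: 1920,  # whole note
    0b001: 960,   # half note
    0b010: 480,   # quarter note
    0b011: 240,   # eighth note
    0b100: 160,   # eighth-note triplet
    0b101: 120,   # sixteenth note
    0b110: 80,    # sixteenth-note triplet
    0b111: 60,    # thirty-second note
}

def to_note_message(note_number: int, initial_note_number: int):
    """Return the 5-bit difference from the initial note number, or None if out of range."""
    diff = note_number - initial_note_number
    return diff if -16 <= diff <= 15 else None

def quantize_length(ticks: int) -> int:
    """Adjust a duration or gate time to the closest predetermined note length (3-bit code)."""
    return min(NOTE_LENGTH_TICKS, key=lambda code: abs(NOTE_LENGTH_TICKS[code] - ticks))

def channel_in_allowed_range(note_numbers: list, initial: int) -> bool:
    """A channel is within the allowed note range if every difference fits in 5 bits."""
    return all(to_note_message(n, initial) is not None for n in note_numbers)

print(to_note_message(65, 60))   # 5
print(quantize_length(230))      # 3 (0b011, closest to an eighth note at 240 ticks)
```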
The above-described processing converts the music data of Figs. 3A and 3B into the formats shown in Figs. 2A and 2B, respectively. More specifically, in the mono block format of the compressed data shown in Fig. 2A, the note number of the first tone of each of the channels ch1 through ch4 is added to the header data as the "initial note number". Each "performance event" data is composed of 2-bit "channel" data, 5-bit "note message" data representing the difference between the note number of the performance event and the initial note number, 3-bit "duration" data, and 3-bit "gate time" data.
On the other hand, in the multi-channel block format of the compressed data shown in Fig. 2B, in the block of each channel, the note number of the first tone of the channel is added to the header data as the "initial note number". Each "performance event" data is composed of 5-bit "note message" data representing the difference between the note number and the initial note number, 3-bit "duration" data, and 3-bit "gate time" data. Needless to say, in the "performance event" data corresponding to the first tone, the "note message" data is recorded as "00000b" (meaning that the difference is 0).
Although the note number of the first performance event in each channel is set as the initial note number (initial tone pitch) in the above-described process, this is not limitative; an arbitrary note number in each channel may be set as the initial note number. Alternatively, a separately input and set note number may be used as the initial note number.
Hereinafter, the second music data compression method of the present embodiment is described.
The second music data compression method differs from the first music data compression method, which compresses music data on a channel-by-channel basis, in that the music data is compressed across all channels. More specifically, first, the performance events of all channels are examined, and the note number of the first performance event among all channels is detected and set as the initial note number (initial tone pitch). Then, the note number of each performance event of each channel is converted into a difference form representing the difference (pitch difference) from the initial note number. This conversion into difference form is performed only when the difference between the note number of every performance event of every channel and the initial note number can be represented by 5 bits. Then, as in the first method, the "duration" and "gate time" data of each performance event are adjusted to the predetermined note length in Table 2 closest to them and converted into the corresponding 3-bit data.
Also in the second music data compression method, an arbitrary note number of all channels may be set as an initial note number (initial tone pitch), or an arbitrarily input and set note number may be used as the initial note number.
Through the above-described processing, in the mono block format, the "initial note number" common to all channels is added to the header data, and each performance event data is composed, similarly to the format of Fig. 2A, of 2-bit "channel" data, 5-bit "note message" data representing the difference between the note number of the performance event and the initial note number, 3-bit "duration" data, and 3-bit "gate time" data.
Since each note number is converted into a difference from the initial note number as described above, the music data is compressed compared with the case where the pitch is expressed as an absolute value over the entire note range (note number). Also, each of the "duration" and "gate time" data is expressed by a value adjusted to a predetermined note length, which further compresses the music data. Further, when the mono block format is converted into the multi-channel block format, the data of each performance event no longer needs "channel" data, which also yields compressed data.
In the present embodiment, in addition to the various compression processes described above, the music data can be compressed even more significantly by performing the following compression process on the performance event data sequence: the data of each pair of adjacent performance events are compared to detect a match or mismatch in each data element between the later performance event and the earlier one. Then, 3-bit "state" data corresponding to the detected match/mismatch pattern is added to the later performance event, and only the data required according to the match/mismatch pattern (the mismatching data) is left in the later performance event, thereby compressing the performance event data.
It should be noted that in the mono block format, the performance event data sequence of the entire file (i.e., the sequence of performance event data of all channels) is compressed, whereas in the multi-channel block format, the performance event data sequence of each channel (each block) is compressed on a channel-by-channel (block-by-block) basis. As described above, for a channel in which the difference between at least one note number and the initial note number cannot be expressed with 5 bits, the note numbers are not converted into difference form; the above-described compression process is still performed on such a channel, particularly in the mono block format, but in a manner different from the other channels: the comparison between adjacent performance event data is performed by detecting a match or mismatch between their note numbers (blocks and notes).
Table 3 shows, for the mono block format, the states, the match/mismatch conditions, the data following the state, and the total number of bits of the performance event data corresponding to each state. The example of Table 3 assumes that the note numbers have been converted into difference form.
TABLE 3
| Mono block format |||
| State (3 bits) | Condition | Data following the state (required data) | Total bits |
| 000b | Matches previous event in B, C, D | A | 5 |
| 001b | Matches previous event in C, D | A,B | 10 |
| 010b | Matches previous event in A, B | C,D | 9 |
| 011b | Matches previous event in C | A,B,D | 13 |
| 100b | Matches previous event in B | A,C,D | 11 |
| 101b | Matches previous event in A | B,C,D | 14 |
| 110b | Differs from previous event in all elements | A,B,C,D | 16 |
| 111b | Tie/slur processing with previous event | A,B,C,D | 16 |
A: Channel (2 bits)
B: Note message (5 bits)
C: Duration (3 bits)
D: Gate time (3 bits)
For example, when a performance event matches the previous performance event in "note message", "duration", and "gate time", as listed in the uppermost row, only the "channel" data is left after the "state" data to form the compressed performance event. In this case, the total of 5 bits is obtained by adding the 2-bit "channel" data to the 3-bit "state" data. Similarly, in the second and subsequent rows, the "state" data is followed only by the required data (the mismatching data) to form the compressed performance event. As a result, each performance event is compressed to at most 16 bits and at least 5 bits. It should be noted that in the condition column of the table, elements joined by commas, such as "B, C, D", form an "AND" condition.
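As a rough Python sketch of how the state of Table 3 could be chosen for one mono-block event (the dictionary-based event representation and the greedy ordering by compressed size are assumptions, and the tie/slur state "111b" is ignored here):

```python
# Hypothetical state selection for the mono block format (Table 3).
# An event is a dict with elements A (channel), B (note message),
# C (duration) and D (gate time); states are tried in order of
# increasing compressed size, and the tie/slur state 111b is ignored.

MONO_STATES = [                       # (state, elements that must match, elements kept)
    (0b000, {"B", "C", "D"}, ["A"]),            # 5 bits total
    (0b010, {"A", "B"},      ["C", "D"]),       # 9 bits
    (0b001, {"C", "D"},      ["A", "B"]),       # 10 bits
    (0b100, {"B"},           ["A", "C", "D"]),  # 11 bits
    (0b011, {"C"},           ["A", "B", "D"]),  # 13 bits
    (0b101, {"A"},           ["B", "C", "D"]),  # 14 bits
    (0b110, set(),           ["A", "B", "C", "D"]),  # 16 bits
]

def compress_event(event: dict, previous: dict) -> tuple:
    """Pick the cheapest state whose match condition holds; keep only the required data."""
    matches = {e for e in "ABCD" if event[e] == previous[e]}
    for state, needed, kept in MONO_STATES:
        if needed <= matches:
            return state, {e: event[e] for e in kept}
    raise AssertionError("unreachable: state 110b always applies")

prev = {"A": 0, "B": 3, "C": 0b010, "D": 0b010}
curr = {"A": 0, "B": 5, "C": 0b010, "D": 0b010}
print(compress_event(curr, prev))   # (1, {'A': 0, 'B': 5}) -> state 001b, 10 bits in total
```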
Table 4 shows, for the multi-channel block format, the states, the match/mismatch conditions, the data following the state, and the total number of bits of the performance event data. The example of Table 4 also assumes that the note numbers have been converted into difference form.
TABLE 4
| Multi-channel block format |||
| State (3 bits) | Condition | Data following the state (required data) | Total bits |
| 000b | Matches previous event in B, C, D | - | 3 |
| 001b | Matches previous event in B, D | C | 6 |
| 010b | Matches previous event in C, D | B | 8 |
| 011b | Matches previous event in B | C,D | 9 |
| 100b | Matches previous event in C | B,D | 11 |
| 101b | Matches previous event in D | B,C | 11 |
| 110b | Differs from previous event in all elements | B,C,D | 14 |
| 111b | Tie/slur processing with previous event | B,C,D | 14 |
B: Note message (5 bits)
C: Duration (3 bits)
D: Gate time (3 bits)
The elements of Table 4 are similar to those of Table 3 and thus are not described in detail. However, as can be seen from Table 4, in the multi-channel block format each performance event does not contain "channel" data, which makes the performance event data shorter than in the mono block format.
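To make the variable-length nature of these events concrete, the following sketch packs compressed multi-channel events into a bitstream; the MSB-first packing order and the fixed element order B, C, D after the state are assumptions, since the specification only gives the element widths.

```python
# Hypothetical bit packing of compressed multi-channel events (Table 4).
# Assumed order within one event: 3-bit state, then B (5 bits), C (3 bits) and
# D (3 bits), each included only when it is "required data" for that state.

ELEMENT_WIDTHS = {"B": 5, "C": 3, "D": 3}

class BitWriter:
    def __init__(self) -> None:
        self.bits: list = []

    def write(self, value: int, width: int) -> None:
        """Append `value` MSB-first using exactly `width` bits."""
        self.bits.extend((value >> (width - 1 - i)) & 1 for i in range(width))

    def to_bytes(self) -> bytes:
        padded = self.bits + [0] * (-len(self.bits) % 8)    # zero-pad the final byte
        return bytes(int("".join(map(str, padded[i:i + 8])), 2)
                     for i in range(0, len(padded), 8))

def pack_event(writer: BitWriter, state: int, required: dict) -> None:
    writer.write(state, 3)
    for name in "BCD":                                      # fixed element order
        if name in required:
            writer.write(required[name], ELEMENT_WIDTHS[name])

w = BitWriter()
pack_event(w, 0b000, {})                 # event fully matching the previous one: 3 bits
pack_event(w, 0b010, {"B": 0b00101})     # only the note message differs: 8 bits
print(len(w.bits), w.to_bytes().hex())   # 11 08a0 (11 bits, zero-padded to 2 bytes)
```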
Figs. 1A and 1B each illustrate an example of music data before and after compression of a performance event sequence according to the present embodiment. In the figures, "A" represents "channel" data, "B" represents "note message" data, "C" represents "duration" data, "D" represents "gate time" data, and "St" represents "state" data. In the mono block format shown in Fig. 1A, the "channel", "note message", "duration", and "gate time" data constitute the data of one performance event before compression, 13 bits in total. If, in the first performance event of Fig. 1A, the "state" data assumes "000" (corresponding to the uppermost row of Table 3) according to the matching/mismatching condition (pattern) with the previous performance event, the performance event is represented only by the "state" data and the "channel" data, and is thus compressed into a 5-bit performance event. Likewise, a performance event whose "state" data assumes "001" is compressed into 10 bits, and a performance event whose "state" data assumes "010" is compressed into 9 bits.
On the other hand, in the multi-channel block format shown in Fig. 1B, the "note message", "duration", and "gate time" data constitute the data of one performance event before compression, 11 bits in total. If the "state" data of the first performance event in Fig. 1B takes "000" (corresponding to the uppermost row of Table 4) according to the matching/mismatching condition (pattern) with the previous performance event, the performance event is represented by the "state" data alone, and is thus compressed into a 3-bit performance event. Similarly, a performance event whose "state" data assumes "001" is compressed into 6 bits, and a performance event whose "state" data assumes "010" is compressed into 8 bits.
Fig. 8 is a flowchart showing the main part of the music data compression process performed by the distribution server 1 shown in Fig. 5. Fig. 9 is a flowchart showing the mono block compression process subroutine, and Fig. 10 is a flowchart showing the multi-channel block compression process subroutine. Also, Fig. 11 is a flowchart showing the main part of the music data distribution process. It should be noted that the processes shown in these flowcharts use the first music data compression method. Referring first to Fig. 8, when the music data compression process is started, predetermined SMF or SMAF data is designated in step S1, and in step S2 the mono block compression process shown in Fig. 9 is performed on the designated data and the compressed data is stored. Then, in step S3, the multi-channel block compression process shown in Fig. 10 is performed on the designated data and the compressed data is stored. In the following step S4, it is determined whether the process is completed, and the above steps are repeated as necessary to store the compressed music data in the external storage 1b.
In the mono block compression process shown in Fig. 9, the format of the music data to be compressed is determined in step S21. If it is determined in step S21 that the music data is in the multi-channel block format, the process proceeds to step S22; if the music data is in the mono block format, the process proceeds to step S23. In step S22, the data of all channels are merged, and the performance events of all channels are arranged in time-series order based on their duration data. Then, the channel number of the original channel is added to each performance event, and the duration of each performance event is changed to correspond to the interval between performance events in the merged, all-channel arrangement. In short, in step S22, the multi-channel block format is converted into the mono block format. The process then proceeds to step S23.
In step S23, the initial note number of each channel is detected and stored, and in step S24 it is detected, on a channel-by-channel basis, whether the difference between each note number of the channel and the corresponding initial note number can be expressed with 5 bits, i.e., whether each channel is within the allowed note range. Then, in step S25, the note numbers of each channel within the allowed note range are converted into data (note messages) representing the difference from the corresponding initial note number, and are stored. Then, in step S26, each performance event data is converted according to the corresponding condition in Table 3 by comparing the performance event with the previous performance event. More specifically, if a performance event contains any data matching the previous performance event, that data is deleted from the current performance event, and the state corresponding to the condition (match/mismatch pattern) satisfied by the comparison of the two performance events is added, thereby forming one compressed performance event. This processing is performed for all performance events of the music data, and when it is completed, the process returns to the main routine shown in Fig. 8.
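Steps S23 through S26 might be pictured roughly as in the Python sketch below; the event representation, the handling of the first event (encoded with state "110b"), and the treatment of out-of-range channels are simplifying assumptions, and the helpers to_note_message(), channel_in_allowed_range(), quantize_length() and compress_event() are the ones sketched earlier.

```python
# Simplified sketch of steps S23-S26 of the mono block compression process.
# Input events are dicts with keys "channel", "note_number", "duration" and
# "gate_time", already arranged in time-series order across all channels.

def compress_mono_block(events: list) -> tuple:
    # S23: detect and store the initial note number of each channel.
    initial = {}
    for ev in events:
        initial.setdefault(ev["channel"], ev["note_number"])

    # S24: check, channel by channel, whether every difference fits in 5 bits.
    in_range = {
        ch: channel_in_allowed_range(
            [e["note_number"] for e in events if e["channel"] == ch], init)
        for ch, init in initial.items()
    }

    compressed, previous = [], None
    for ev in events:
        ch = ev["channel"]
        # S25: convert note numbers of in-range channels into note messages.
        note = (to_note_message(ev["note_number"], initial[ch])
                if in_range[ch] else ev["note_number"])
        current = {"A": ch, "B": note,
                   "C": quantize_length(ev["duration"]),
                   "D": quantize_length(ev["gate_time"])}
        # S26: compare with the previous event and keep only the required data.
        if previous is None:
            compressed.append((0b110, current))   # first event: nothing to compare against
        else:
            compressed.append(compress_event(current, previous))
        previous = current
    return {"initial_note_numbers": initial, "in_range": in_range}, compressed
```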
In the multi-channel block compression process shown in Fig. 10, the format of the music data to be compressed is determined in step S31. If it is determined in step S31 that the music data is in the mono block format, the process proceeds to step S32; if the music data is in the multi-channel block format, the process proceeds to step S33. In step S32, the performance events are classified into channel blocks, each forming a set of events of the same channel, based on the channel data contained in the respective performance events, thereby grouping the data by channel. Then, the duration of each performance event is changed so that it equals the interval from the previous performance event of the same channel. In short, in step S32, the mono block format is converted into the multi-channel block format. The process then proceeds to step S33.
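The regrouping of step S32 can be sketched as below; the assumption here is that the duration of each mono-format event is the interval from the immediately preceding event of any channel, so the per-channel durations are recovered by accumulating an absolute time and differencing it per channel.

```python
# Rough sketch of step S32: regrouping mono-format events into channel blocks
# and recomputing each duration as the interval from the previous event of the
# same channel (events are dicts with "channel" and "duration" keys, in
# time-series order; other keys are carried over unchanged).

from collections import defaultdict

def split_into_channel_blocks(events: list) -> dict:
    blocks = defaultdict(list)
    last_time = defaultdict(int)      # time of the previous event of each channel
    absolute_time = 0
    for ev in events:
        absolute_time += ev["duration"]           # mono duration = gap to previous event
        block_event = dict(ev, duration=absolute_time - last_time[ev["channel"]])
        del block_event["channel"]                # the channel is implied by the block
        blocks[ev["channel"]].append(block_event)
        last_time[ev["channel"]] = absolute_time
    return dict(blocks)
```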
In step S33, the initial note number of each channel is detected and stored, and in step S34 it is detected whether each channel is within the allowed note range. Then, in step S35, the note numbers of each channel within the allowed note range are converted into data (note messages) representing the difference from the corresponding initial note number, and are stored. Next, in step S36, each performance event data of each channel is converted according to the corresponding condition in Table 4 by comparing the performance event with the previous performance event. More specifically, if a performance event contains any data matching the previous performance event, that data is deleted from the current performance event, and the state corresponding to the satisfied condition is added, thereby forming one compressed performance event. This processing is performed for all performance events of the music data, and when it is completed, the process returns to the main routine shown in Fig. 8.
The music data distribution process shown in Fig. 11 is executed whenever the cellular phone 2 accesses the distribution server 1, for example to receive a content distribution service. First, in step S41 it is determined (monitored) whether the user (the cellular phone 2) has requested distribution of an incoming call tune. If a distribution request has been made, a Web page file for displaying a list of the titles of the compressed music data stored in the external storage 1b and a selection input screen is transmitted to the cellular phone 2 in step S42. As a result, the list of tune titles and the selection input screen are displayed on the display of the cellular phone 2, enabling the user to select, on the cellular phone 2, the desired tune title and the required compression type, i.e., either mono block compression or multi-channel block compression.
In this state, the selection input of the tune title and the compression type is monitored in step S43, and when the selection input is made, the selected compression type is determined in step S44. If the selected compression type is multi-channel block compression, the music data of the selected tune compressed by the multi-channel block compression method is distributed (transmitted) in step S45, after which the process proceeds to step S47. On the other hand, if the selected compression type is mono block compression, the music data of the selected tune compressed by the mono block compression method is distributed (transmitted) in step S46, after which the process proceeds to step S47. Then, other processing, including billing, is performed in step S47, after which the distribution process is terminated.
Although an initial note number is used for each channel in the mono block compression process shown in Fig. 9, in accordance with the first music data compression method, this is not restrictive; a single note number, such as a note number arbitrarily selected from all channels, a predetermined note number, an externally input note number, or a note number intermediate between the highest and lowest tone pitches of all channels, may be set as a common initial note number (initial tone pitch) for all channels. Also, although an initial note number is used for each channel in the multi-channel block compression process of Fig. 10, this is not limitative; any one of these note numbers may be used as the initial note number.
When the compressed music data has been distributed to the cellular phone 2 as described above, the music data is stored in the RAM2c of the cellular phone 2. In the cellular phone 2, a music piece is reproduced from the music data when an operation for checking the incoming call tune is performed or when an incoming call arrives in the normal mode. To reproduce the music piece while reading the performance events of the music data from the RAM2c, the CPU2a performs the following processing:
when reproducing music from music data compressed by a monaural block compression method, "tone" data of the respective channels ch1 to ch4 are read and set to the tone generator of the music reproducing section 2g, and then "initial note numbers" of the respective channels are read. Then, the first performance event formed of "channel", "note message/note number", "duration", and "gate time" is read. Then, the "note message" of the first performance event data is added to the channel "initial note number" corresponding to the "channel" data contained in the first performance event, and the data obtained by this addition is transmitted as "note number" data to the music reproducing section 2g together with the "channel", "duration", and "gate time" data. It should be noted that if the channel for the performance event is not within the allowable note range, the performance event data is thus transmitted to the music reproducing section 2 g.
The performance events following the first one have different bit lengths; therefore, the "state" data of each performance event is read first to determine which data elements form the performance event. Each data element deleted by the compression processing is then restored (decompressed) by copying the corresponding data element of the previous performance event. Also, the "note message" is added to the "initial note number", and the "channel", "note number", "duration", and "gate time" data are transferred to the music reproducing section 2g. This process is repeated for the performance events, reading their data sequentially, until a predetermined amount of data has been written into the data buffer of the music reproducing section 2g.
When reproducing music from music data compressed by the multi-channel block compression method, 4 pointers corresponding to the respective channels ch1 to ch4 are set, the "timbre" data is read on a channel-by-channel basis and set in the tone generator of the music reproducing section 2g, and then the "initial note number" of each channel is read. Next, the pointers of the respective channels are updated so that the performance events are read on a channel-by-channel basis. Similarly to the above-described processing for music data compressed by the mono block compression method, the restoration (decompression) processing, i.e., restoring each performance event according to its state and adding the "note message" to the "initial note number", is performed, and then the "channel", "note number", "duration", and "gate time" data are transferred to the music reproducing section 2g.
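A minimal Python sketch of the per-event restoration performed on the phone side is given below; the dictionary-based representation mirrors the compression sketch given earlier, and the tie/slur state "111b" and out-of-range channels are again ignored.

```python
# Hypothetical decompression of one compressed mono-block event. Elements that
# were deleted during compression are copied from the previous, already restored
# event, and the note message is added back to the initial note number of the
# event's channel to recover the absolute note number.

def decompress_event(data: dict, previous: dict, initial_note_numbers: dict) -> dict:
    """`data` holds only the elements that followed the 3-bit state."""
    restored = {e: data.get(e, previous[e]) for e in "ABCD"}
    # A: channel, B: note message (difference), C: duration, D: gate time.
    note_number = initial_note_numbers[restored["A"]] + restored["B"]
    return dict(restored, note_number=note_number)

# Example: state 001b meant "matches the previous event in C and D", so only
# A and B were transmitted; C and D are copied from the previous event.
previous = {"A": 0, "B": 3, "C": 0b010, "D": 0b010}
print(decompress_event({"A": 0, "B": 5}, previous, {0: 60}))
# {'A': 0, 'B': 5, 'C': 2, 'D': 2, 'note_number': 65}
```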
Since the music data is distributed in a compressed state as described above, the communication time and cost required for downloading the music data can be reduced. Also, as described above, the music data is decompressed at reproduction time, so the RAM2c of the cellular phone 2 can have a small capacity for storing music data. Alternatively, the compressed music data may be decompressed (to its original size) before being stored in the RAM2c. In this case, the RAM2c needs a somewhat larger capacity, but the communication time and cost required for downloading the music data can still be reduced. Since the decompression and reproduction processing is performed by the program stored in the ROM2b, the program can be configured so that the user can choose between decompressing the music data before reproduction and decompressing it sequentially during reproduction.
Also, music can be reproduced from the music data while the music data is being downloaded. In this case, if the music data is compressed by the mono block compression method, decompression and reproduction can start almost at the same time as the reception of the first performance event, making the format suitable for streaming reproduction. It should be noted that music data compressed by the multi-channel block compression method can be reproduced only after reception of the performance events of the last channel has started.
In the above-described embodiment, since the tone pitch is expressed by a difference from the initial note number, the data size can be reduced. However, as in the case where the difference from the initial note number cannot be expressed with 5 bits, the data compression of the performance event sequence may be performed using note numbers that have not been so converted. Further, the structure of the note number and the number of channels are not limited to those of the above-described embodiment, but may be set freely.
Although in the above-described embodiment, the program for decompression processing is stored in the ROM2b of the cellular phone 2, this is not limitative, and the program may be distributed from the distribution server 1 to the cellular phone 2.
Although in the above-described embodiment, the music data of the incoming call tune is distributed to the cellular phone 2, the present invention may be applied to the case where the music data is distributed to the personal computer 4 compatible with the network, as shown in fig. 5. In this case, the music data can be used for, for example, automatic performance of the personal computer 4 or the electronic musical instrument 5. Furthermore, the present invention can also be used to distribute karaoke music data to a karaoke device or music data used in game software to a game machine.
Although the SMAF format is utilized in the above-described embodiment, the present invention can be applied to any music data containing a performance event sequence formed of note information. Thus, it goes without saying that the present invention can be applied to music data in the general SMF format.
The present invention may be applied to a system composed of a plurality of devices or a single device. It is to be understood that the object of the present invention can also be achieved by supplying a storage medium, in which program codes of software that realizes the functions of the above-described embodiments are stored, to a system or an apparatus, causing a computer (or CPU or MPU) of the system or apparatus to read and execute the program codes stored in the storage medium.
In this case, the program code itself read from the storage medium realizes the functions of the above-described embodiments, and therefore the storage medium storing the program code constitutes the present invention. The storage medium for supplying the program code to the system or apparatus may be, for example, a floppy disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a CD-RW, a DVD-ROM, a DVD-RAM, a DVD-RW, a DVD+RW, a magnetic tape, a nonvolatile memory card, or a ROM. Downloading over a network may also be utilized.
Further, it is to be understood that the functions of the above-described embodiments may be realized not only by the computer executing the program code read out, but also by an OS (operating system) or the like running on the computer performing part or all of the actual operations according to the instructions of the program code.
Further, it is to be understood that the functions of the above-described embodiments may be realized by writing the program code read from the storage medium into a memory provided in an expansion board inserted into the computer or in an expansion unit connected to the computer, and then causing a CPU or the like provided in the expansion board or expansion unit to perform part or all of the actual operations according to the instructions of the program code.
Claims (9)
1. A music data compression method comprising the steps of:
receiving music data containing a sequence of pieces of performance event information, each of which is formed of note information; and
converting each piece of performance event information of the music data into another form of performance event information, which contains state information corresponding to a matching or mismatching pattern in note information between the piece of performance event information and the previous piece of performance event information, and note information determined according to the matching or mismatching pattern corresponding to the state information.
2. A music data compression method according to claim 1, wherein said converting step includes a step of expressing tone pitch information included in the note information by a difference from a predetermined initial tone pitch.
3. A music data compression method according to claim 2, wherein the initial tone pitch is a tone pitch of any piece of performance event information contained in the music data.
4. The music data compression method according to claim 2, wherein the music data received in said receiving step includes a sequence of pieces of performance event information of a plurality of channels, and
wherein the converting step includes arranging the pieces of performance event information for all channels in time-series order, and detecting a match or mismatch in note information between each piece of performance event information arranged for all channels in time-series order and the previous piece of performance event information.
5. The music data compression method according to claim 2, wherein the music data received in said receiving step includes a sequence of pieces of performance event information for a plurality of channels, and
wherein the converting step includes arranging the pieces of performance event information in time-series order on a channel-by-channel basis, and detecting a match or mismatch in note information between each piece of performance event information arranged in time-series order on a channel-by-channel basis and the previous piece of performance event information.
6. A music data compression method according to claim 4, wherein an initial tone pitch is set for each channel, and the tone pitch information of each channel is represented by a difference from the initial tone pitch set for that channel.
7. A music data compression method according to claim 4, wherein the initial tone pitch includes a single tone pitch, and the tone pitch information of each piece of performance event information for all channels arranged in time series order is represented by a difference from the single initial tone pitch.
8. A music data compression method as claimed in claim 1, wherein the note information of each piece of performance event information contains interval information representing an interval from the previous piece of performance event information and utterance duration information of the performance event associated with the piece of performance event information, and
wherein the converting step includes expressing the interval information and the utterance duration information by predetermined note lengths.
9. A music data compression apparatus comprising:
a reception device that receives music data containing a sequence of pieces of performance event information, each piece of performance event information being formed of note information; and
conversion means for converting each piece of performance event information of the music data into another form of performance event information, which contains state information corresponding to a matching or mismatching pattern in note information between the piece of performance event information and the previous piece of performance event information, and note information determined based on the matching or mismatching pattern corresponding to the state information.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2002078264A JP3613254B2 (en) | 2002-03-20 | 2002-03-20 | Music data compression method |
| JP2002-078264 | 2002-03-20 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1056036A1 HK1056036A1 (en) | 2004-01-30 |
| HK1056036B true HK1056036B (en) | 2006-04-07 |