HK1181119B - Creation method, and creation device for characteristic amount of fp - Google Patents
Publication number: HK1181119B (HK13108497.6A)
Authority: Hong Kong (HK)
Description
Technical Field
The present invention relates to a method for generating characteristic values of a pattern; a method for generating characteristic values of an FP of a multi-component substance, used to evaluate the quality of the multi-component substance (for example, a Chinese medicine, which is a multi-component drug); a program for generating the characteristic values; and a device for generating the characteristic values.
Background
Examples of the multi-component substance include natural drugs such as Chinese medicines, which are drugs composed of a plurality of components (hereinafter referred to as multi-component drugs). The quantitative and qualitative profile of these drugs varies with factors such as the geological and ecological conditions, the collection time, the collection place, and the growth period of the raw crude drug used.
Therefore, certain standards are defined for the quality of these multicomponent drugs and the like to secure safety and effectiveness thereof, and quality evaluations are performed by national regulatory agencies, chemical organization groups, manufacturers, and the like based on the standards.
However, the criteria for determining the quality of a multicomponent drug are generally set by selecting one or several components that are characteristic of the drug and basing the criteria on the content of those components.
For example, in non-patent document 1, when the identification of an effective component in a multicomponent drug is impossible, a plurality of components having physical properties such as quantitative analysis availability, high solubility in water, no decomposition in hot water, and no chemical reaction with other components are selected, and the content of these components obtained by chemical analysis is used as a criterion for evaluation.
Further, it is also known to apply chromatography to a multicomponent drug, obtain an ultraviolet-visible absorption spectrum for each retention time, and set a criterion for evaluation from part of the component information.
For example, in patent document 1, a part of peaks in HPLC chromatogram data (hereinafter, referred to as a chromatogram) is selected and a multi-component drug is evaluated by barcoding.
However, in these methods, the evaluation target is limited to "the content of the specific component" or "the chromatogram peak of the specific component", and only a part of the components contained in the multicomponent drug is the evaluation target. Therefore, since a multicomponent drug contains many components other than those to be evaluated, the method for evaluating a multicomponent drug is not accurate enough.
In order to accurately evaluate the quality of a multicomponent drug, it is necessary to evaluate peak information covering all peaks, or nearly all peaks with only the trivial few percent removed.
However, it is difficult to match a large number of peaks to one another efficiently and with high accuracy, which hinders highly accurate and efficient evaluation of a multicomponent drug.
Further, even among multicomponent drugs of the same product name, the crude drugs used as raw materials are natural products, and therefore the components may differ slightly from one drug to another. Consequently, even for medicines of the same quality, the content ratios of the constituent components may differ, or a component present in one medicine may be absent from another (hereinafter referred to as an error between medicines). In addition, the peak intensity of the chromatogram and the elution time of a peak are not strictly reproducible (hereinafter referred to as analysis error). As a result, peaks derived from the same component cannot be matched between multicomponent drugs for all peaks or nearly all peaks (this matching is hereinafter referred to as peak assignment), and highly accurate and efficient evaluation is hindered.
Documents of the prior art
Patent document 1: Japanese Laid-Open Patent Publication No. 2002-214215
Non-patent document 1: Journal Pharmaceutical Affairs, Vol. 28, No. 3, 67-71 (1986)
Disclosure of Invention
Problems to be solved by the invention
The conventional evaluation methods are limited in their ability to evaluate the quality of a multicomponent material efficiently and with high accuracy.
Means for solving the problems
To contribute to improved evaluation accuracy and efficiency, the present invention provides a method for generating a pattern feature value, characterized by including a pattern region division feature value generation step of dividing a pattern in which peaks change in time series into a plurality of regions and generating a pattern region division feature value from the existence rate or the existence amount of the peaks existing in each region.
The present invention provides a method for generating characteristic values of FPs, comprising an FP region division characteristic value generation step of dividing an FP, which is composed of peaks detected from a chromatogram of a multicomponent material and their retention times, into a plurality of regions, and generating FP region division characteristic values from the existence rate or amount of the peaks existing in each region.
The present invention is a pattern feature value creation program for creating a pattern region division feature value by realizing a pattern region division feature value creation function on a computer, dividing a pattern in which peaks change in time series into a plurality of regions, and creating a pattern region division feature value from the existence rate or the existence amount of the peaks existing in each region.
The present invention is an FP feature value generation program that realizes, on a computer, an FP region division feature value generation function of dividing an FP composed of peaks detected from a chromatogram of a multicomponent material and retention times thereof into a plurality of regions, and generating an FP region division feature value from the existence rate or the existence amount of the peaks existing in each region.
The present invention is a pattern feature value generating device, comprising a pattern region division feature value generating section for dividing a pattern in which peaks change in time series into a plurality of regions, and generating a pattern region division feature value from the existence rate or the existence amount of the peaks existing in each region.
The present invention is an FP feature value generation device, including an FP region division feature value generation unit that divides an FP composed of peaks detected from a chromatogram of a multicomponent material and retention times thereof into a plurality of regions, and generates an FP region division feature value from the existence rate or the existence amount of the peaks existing in each region.
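The region division described above can be illustrated with a minimal Python sketch. It is not the implementation of the invention. It assumes that an FP is a list of (retention time, peak height) pairs, that the regions are rectangles bounded by vertical (retention time) and horizontal (peak height) dividing lines, that the existence amount of a region is the sum of the peak heights in it, and that the existence rate is that sum divided by the total height of all peaks. The function name `region_division_features` and its parameters are hypothetical.

```python
from bisect import bisect_right


def region_division_features(peaks, time_edges, height_edges, use_rate=True):
    """Return one feature value per region, in row-major order.

    peaks        : list of (retention_time, height) pairs (assumed FP layout)
    time_edges   : ascending positions of the vertical dividing lines,
                   including the outer boundaries
    height_edges : ascending positions of the horizontal dividing lines,
                   including the outer boundaries
    use_rate     : True  -> existence rate (region sum / total of all peaks)
                   False -> existence amount (region sum of peak heights)
    """
    n_cols = len(time_edges) - 1
    n_rows = len(height_edges) - 1
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    total = sum(height for _, height in peaks)

    for retention_time, height in peaks:
        # Regions are half-open: [lower edge, upper edge).
        col = bisect_right(time_edges, retention_time) - 1
        row = bisect_right(height_edges, height) - 1
        if 0 <= col < n_cols and 0 <= row < n_rows:
            sums[row][col] += height  # peaks outside the grid are ignored

    flat = [value for row in sums for value in row]
    if use_rate and total > 0:
        flat = [value / total for value in flat]
    return flat


# Three peaks, two retention-time bands and two height bands (four regions).
example_peaks = [(1.0, 10.0), (2.5, 30.0), (7.0, 60.0)]
example_rates = region_division_features(example_peaks, [0, 5, 10], [0, 20, 100])
```

Because every peak that falls inside the grid contributes to some region, small peaks are reflected in the feature values without having to be matched individually.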
Advantageous Effects of Invention
Because the method for generating the characteristic value of the pattern or the FP according to the present invention is configured as described above, the characteristic value of the pattern or the FP can be easily obtained by region division. Therefore, for example, even trivial peaks can be captured when generating the characteristic value.
Because the characteristic value creation program of the pattern or FP according to the present invention is configured as described above, each function can be realized on a computer, and the characteristic value of the pattern or FP can be easily obtained. Because the characteristic value generating device of the pattern or FP according to the present invention is configured as described above, each part can perform its function, and the characteristic value of the pattern or FP can be easily obtained.
Drawings
FIG. 1 is a block diagram of an apparatus for evaluating a multicomponent drug (example 1);
FIG. 2 is a block diagram showing the evaluation sequence of a multicomponent drug (example 1);
FIG. 3 is an explanatory diagram of FPs created from three-dimensional chromatogram data (hereinafter referred to as 3D chromatogram) (example 1);
FIG. 4 shows FPs, in which (A) is the FP of a drug A, (B) is the FP of a drug B, and (C) is the FP of a drug C (example 1);
FIG. 5 is a graph showing the retention times of the target FP and the reference FP (example 1);
fig. 6 is a diagram showing a retention time appearance pattern of the object FP (example 1);
FIG. 7 is a diagram showing a retention time appearance pattern of a reference FP (example 1);
FIG. 8 is a diagram showing the number of coincidences of the retention time appearance distances of the target FP and the reference FP (example 1);
fig. 9 is a diagram showing the degree of coincidence of retention time appearance patterns of the object FP and the reference FP (embodiment 1);
FIG. 10 is a diagram showing an assignment target peak of the target FP (example 1);
FIG. 11 is a peak pattern diagram formed by 3 peaks including the assignment target peak (example 1);
FIG. 12 is a peak pattern diagram formed by 5 peaks including the assignment target peak (example 1);
FIG. 13 is a diagram showing the allowable width of the assignment target peak (example 1);
fig. 14 is a diagram showing the attribution candidate peak of the reference FP with respect to the attribution target peak (example 1);
fig. 15 is a peak pattern diagram (example 1) formed by 3 peaks of the belonging target peak and the belonging candidate peak;
fig. 16 is a peak pattern diagram (example 1) formed by 3 peaks of the belonging target peak and the other belonging candidate peaks;
fig. 17 is a peak pattern diagram (example 1) formed by 3 peaks of the belonging target peak and the other belonging candidate peaks;
fig. 18 is a peak pattern diagram (example 1) formed by 3 peaks of the belonging peak and the other belonging candidate peaks;
fig. 19 is a peak pattern diagram (example 1) formed by 5 peaks of the belonging target peak and the belonging candidate peak;
fig. 20 is a peak pattern diagram (example 1) formed by 5 peaks of the belonging target peak and the other belonging candidate peaks;
fig. 21 is a peak pattern diagram (example 1) formed by 5 peaks of the belonging target peak and the other belonging candidate peaks;
fig. 22 is a peak pattern diagram (example 1) formed by 5 peaks of the belonging target peak and the other belonging candidate peaks;
fig. 23 is a diagram showing candidate peaks formed by peak patterns of the belonging target peak and the belonging candidate peak (example 1);
fig. 24 is a diagram showing the total number of peak patterns of the belonging peak when 4 peak pattern formation candidate peaks are present (example 1);
fig. 25 is a diagram showing the total number of peak patterns belonging to candidate peaks when 4 peak pattern formation candidate peaks are present (example 1);
FIGS. 26 to 61 are explanatory diagrams each showing a network comparison of the peak pattern of the belonging candidate peak with respect to the peak pattern of the belonging target peak (example 1);
fig. 62 is a diagram showing a method of calculating the degree of matching of peak patterns formed by 3 peaks of the belonging target peak and the belonging candidate peak (example 1);
fig. 63 is a diagram showing a method of calculating the degree of matching of peak patterns formed by 3 peaks of the belonging target peak and the belonging candidate peak (example 1);
fig. 64 is a diagram showing a method of calculating the degree of matching of peak patterns formed by 5 peaks of the belonging target peak and the belonging candidate peak (example 1);
fig. 65 is a diagram showing UV spectra of the belonging target peak and the belonging candidate peak (example 1);
fig. 66 is an explanatory diagram of the coincidence degree of the UV spectra of the belonging target peak and the belonging candidate peak (example 1);
fig. 67 is an explanatory diagram (example 1) of calculating the coincidence degree of the belonging candidate peaks from a comparison between the peak pattern and the UV spectrum;
fig. 68 is an explanatory diagram showing attribution of object FPs to reference group FPs (embodiment 1);
fig. 69 is a diagram showing a condition that a subject FP belongs to a reference group FP (embodiment 1);
fig. 70 is an explanatory diagram showing number quantization by region division (example 1);
FIG. 71 is an explanatory view showing a relationship with a variation in retention time or the like (example 1);
FIG. 72 is an explanatory view showing the position of a change area and performing quantization (example 1);
FIG. 73 is a graph showing FP type 2 data (example 1);
FIG. 74 is an explanatory diagram showing a pattern of FP type 2 (example 1);
fig. 75 is an explanatory diagram showing the feature value conversion of each region formed by the region division of the vertical and horizontal division lines (example 1);
FIG. 76 is an explanatory view showing the setting of the first vertical dividing line (example 1);
FIG. 77 is an explanatory view showing the setting of the first horizontal dividing line (example 1);
FIG. 78 is an explanatory view showing region division by vertical and horizontal dividing lines (example 1);
FIG. 79 is an explanatory diagram showing the number of characteristic-valued regions (example 1);
FIG. 80 is an explanatory view showing region 1 in detail (example 1);
FIG. 81 is a graph showing the heights of all peaks and their total (example 1);
fig. 82 is an explanatory view showing the sum of the peak heights of the region 1 (example 1);
FIG. 83 is a graph showing characteristic values of the entire region (example 1);
FIG. 84 is a graph showing characteristic values of the regions formed by sequentially changing the position of the first vertical dividing line (example 1);
FIG. 85 is a graph showing characteristic values of the regions formed by sequentially changing the position of the first horizontal dividing line (example 1);
fig. 86 is a graph showing characteristic values of 1 type alone without changing the position of each of the vertical and horizontal dividing lines (example 1);
fig. 87 is a diagram showing various objects FP and their evaluation values (MD values) (example 1);
fig. 88 is a diagram showing various objects FP and their evaluation values (MD values) (example 1);
fig. 89 is a diagram showing various objects FP and their evaluation values (MD values) (example 1);
fig. 90 is a diagram showing various objects FP and their evaluation values (MD values) (example 1);
fig. 91 is a diagram showing various objects FP and their evaluation values (MD values) (example 1);
FIG. 92 is a process view showing an evaluation method of a multicomponent drug (example 1);
FIG. 93 is a flowchart showing the evaluation of the quality of a multicomponent drug (example 1);
FIG. 94 is a flowchart showing the evaluation of the quality of a multicomponent drug (example 1);
FIG. 95 is a flow chart showing data processing in the FP creation function of a single wavelength (example 1);
FIG. 96 is a flow chart showing data processing in the FP creation function for a plurality of wavelengths (example 1);
FIG. 97 is a flow chart showing data processing in the FP creation function for a plurality of wavelengths (example 1);
fig. 98 is a flowchart showing data processing in peak assignment processing 1 (selection of reference FP) (embodiment 1);
fig. 99 is a flowchart showing data processing in the peak assignment process 2 (calculation of an assignment score) (example 1);
FIG. 100 is a flowchart showing data processing in peak assignment processing 3 (specifying of corresponding peaks) (example 1);
FIG. 101 is a flowchart showing data processing in peak assignment processing 4 (assignment to the reference group FP) (example 1);
FIG. 102 is a flowchart showing data processing in peak assignment processing 4 (assignment to the reference group FP) (example 1);
fig. 103 is a flowchart showing the coincidence degree calculation processing of the retention time appearance pattern in the peak assignment processing 1 (selection of reference FP) (embodiment 1);
fig. 104 is a flowchart showing the process of calculating the degree of coincidence of UV spectra in the peak assignment process 2 (calculation of an assignment score) (example 1);
fig. 105 is a flowchart showing a matching degree calculation process of peak patterns in the peak assignment process 2 (calculation of an assignment score) (embodiment 1);
FIG. 106 is a flowchart showing details of "FP _ type 2 creation" (embodiment 1);
fig. 107 is a flowchart showing details of "feature valuating processing of the object FP _ type 2 by region segmentation" (embodiment 1);
fig. 108 is a flowchart showing details of "merging of peak feature values and region segmentation feature values of the object FP" (embodiment 1);
FIG. 109 is a flowchart showing the process for creating a reference FP feature merged file (example 1);
FIG. 110 is a flowchart showing the process for creating a reference FP feature merged file (example 1);
FIG. 111 is a flowchart showing details of "reference FP attribution result merging processing (creation of a reference FP correspondence table)" (example 1);
FIG. 112 is a flowchart showing details of "reference FP attribution result merging processing (creation of a reference FP correspondence table)" (example 1);
FIG. 113 is a flowchart showing details of "peak feature value processing (creation of reference group FP)" (embodiment 1);
fig. 114 is a flowchart showing details of "creation processing of reference FP type 2" (embodiment 1);
fig. 115 is a flowchart showing details of "feature value processing of a reference FP by region division" (embodiment 1);
fig. 116 is a flowchart showing the correlation of the eigenvalue merging process of the reference FP (example 1);
FIG. 117 is a graph showing an example of data of a 3D chromatogram (example 1);
FIG. 118 is a graph showing an example of data of peak information (example 1);
FIG. 119 is a graph showing a data example of FP (example 1);
fig. 120 is a graph (example 1) showing an example of the result of calculation of the attribution score (determination result file) of the target FP with respect to the reference FP;
fig. 121 is a graph showing the alignment process of peaks corresponding to the subject FP and the reference FP (example 1);
fig. 122 is a graph (example 1) showing an example of the result (comparison result file) of peaks specified in the object FP and the reference FP;
FIG. 123 is a graph showing an example of data of the reference group FP (example 1);
FIG. 124 is a diagram showing an example of a file of FP peak feature values of a subject (example 1);
FIG. 125 is a chart showing an example of data for object and reference FP type 2 (example 1);
FIG. 126 is a diagram showing an example of a file of characteristic values of FP region segmentation of an object (example 1);
FIG. 127 is a diagram showing an example of an object FP merge feature value file (example 1);
FIG. 128 is a graph showing an example of reference type 2 group FP (example 1);
FIG. 129 is a graph showing an example of reference group merged data (example 1);
fig. 130 is a flowchart showing details of a modification of subroutine 2 applied in place of fig. 104 (embodiment 1); and
FIG. 131 is a graph showing a calculation example of the moving average and the moving slope (example 1).
Detailed Description
In order to contribute to improvement in accuracy and efficiency of evaluation, an FP composed of peaks detected from a chromatogram of a multicomponent material and retention time thereof is divided into a plurality of regions, and FP region division characteristic values are created from the existence rate or the existence amount of the peaks existing in each region.
Example 1
Example 1 of the present invention is a method and a program for evaluating a multicomponent substance, for example, a multicomponent drug, and a method, a program, and a device for creating characteristic values of an FP as a pattern.
The multicomponent drug is defined as a drug containing a plurality of effective chemical components, but is not limited thereto, and includes crude drugs, combinations of crude drugs, extracts of these, traditional Chinese medicines, and the like. The formulation is not particularly limited, and may include, for example, a liquid, an extract, a capsule, a granule, a pill, a suspension, an emulsion, a powder, an alcoholic preparation, a lozenge, an extract decoction, a tincture, a tablet, an aromatic water preparation, a fluid extract, and the like, as prescribed in the General Rules for Preparations of the Japanese Pharmacopoeia, Fifteenth Edition. The multicomponent material may also be a material other than a drug.
Specific examples of the Chinese medicines are described in the "Precautions for Use" of the 148 prescriptions of Chinese medicinal preparations for medical use and in the guide to general-use Chinese medicine prescriptions (1978).
In the evaluation of a multicomponent drug, in order to evaluate whether or not an evaluation target drug is equivalent to a plurality of drugs evaluated as normal products, first, information unique to the drug is extracted from three-dimensional chromatogram data (hereinafter, referred to as a 3D chromatogram) of the evaluation target drug to create a target FP.
Next, each peak of the target FP is assigned to peak correspondence data (hereinafter, referred to as a reference group FP) of all reference FPs created by performing peak assignment processing on all reference FPs, and a peak feature value is obtained.
Further, the already assigned peak is removed from the target FP, the FP type 2 is created from the remaining peaks, and the FP type 2 is subjected to region segmentation to obtain a region segmentation feature value.
The two feature values are then merged to obtain the target FP integrated feature value.
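The two steps just described, creating FP type 2 from the unassigned peaks and merging the two kinds of feature values, can be sketched as follows. This is only an illustration under stated assumptions: peaks are identified by their index in the target FP, the set of assigned indices is taken as given, and merging is assumed to be simple concatenation. The function names `make_fp_type2` and `merge_feature_values` are hypothetical.

```python
def make_fp_type2(peaks, assigned_indices):
    """Keep only the peaks that were NOT assigned to the reference group FP.

    peaks            : list of (retention_time, height) pairs (assumed layout)
    assigned_indices : set of indices of peaks already assigned
    """
    return [peak for i, peak in enumerate(peaks) if i not in assigned_indices]


def merge_feature_values(peak_features, region_features):
    """Concatenate peak feature values and region division feature values.

    Concatenation is an assumption; the text only says the two are merged.
    """
    return list(peak_features) + list(region_features)


target_fp = [(1.0, 10.0), (2.0, 5.0), (3.0, 1.0)]
fp_type2 = make_fp_type2(target_fp, assigned_indices={0, 2})
merged = merge_feature_values([10.0, 1.0], [0.25, 0.75])
```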
The equivalence between the reference group FP and the target FP is evaluated by the MT method using the target FP integrated feature value and the reference FP integrated feature values obtained from all the reference FPs. Finally, the obtained evaluation value (hereinafter referred to as MD value) is compared with a preset determination value (upper limit value of MD value), and it is determined whether or not the evaluation target medicine is equivalent to a normal product.
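As an illustration of the final comparison, the sketch below computes an MD value and checks it against an upper limit. It assumes that "MT method" refers to the Mahalanobis-Taguchi method and uses one common formulation of it: features are standardized with the reference mean and standard deviation, and the squared distance is divided by the number of features so that the reference samples average to 1. Treating a medicine as equivalent when its MD value does not exceed the upper limit is also an assumption, and the function names are hypothetical.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                factor = m[r][c] / m[c][c]
                m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def md_value(reference, target):
    """MD value of `target` against the rows of `reference` (one row per FP)."""
    n, k = len(reference), len(target)
    means = [sum(row[j] for row in reference) / n for j in range(k)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in reference) / n)
            for j in range(k)]
    z_ref = [[(row[j] - means[j]) / stds[j] for j in range(k)]
             for row in reference]
    # Correlation matrix of the standardized reference feature values.
    corr = [[sum(z[i] * z[j] for z in z_ref) / n for j in range(k)]
            for i in range(k)]
    z = [(target[j] - means[j]) / stds[j] for j in range(k)]
    weighted = solve(corr, z)  # corr^-1 * z without forming the inverse
    return sum(a * b for a, b in zip(z, weighted)) / k


def is_equivalent(reference, target, upper_limit):
    """Assumed rule: equivalent when the MD value is within the upper limit."""
    return md_value(reference, target) <= upper_limit


reference_features = [[1.0, 2.0], [2.0, 1.5], [3.0, 3.5], [4.0, 3.0], [5.0, 5.5]]
```

This sketch requires more reference FPs than features and no constant feature; otherwise the standard deviation is zero or the correlation matrix is singular.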
Evaluation device for multicomponent drug
Fig. 1 is a block diagram of an apparatus for evaluating a multi-component drug, fig. 2 is a block diagram showing a procedure for evaluating a multi-component drug, fig. 3 is an explanatory view of FPs created from a 3D chromatogram, and in fig. 4, (A) is an FP of a drug A, (B) is an FP of a drug B, and (C) is an FP of a drug C.
As shown in fig. 1, the evaluation apparatus 1 for a multicomponent drug includes an FP creation unit 3, a target FP peak assignment unit 5, a target FP peak feature value creation unit 7, a target FP type 2 creation unit 9, a target FP region division feature value creation unit 11, a target FP feature value merging unit 13, a reference FP peak assignment unit 15, a reference FP assignment result merging unit 17, a reference FP peak feature value creation unit 19, a reference FP type 2 creation unit 21, a reference FP region division feature value creation unit 23, a reference FP feature value merging unit 25, and an evaluation unit 27. The multi-component medicine evaluation device 1 includes a characteristic value creation device of FP as a pattern.
The FP production unit 3 includes a target FP production unit 29 and a reference FP production unit 31.
The target FP peak assigning unit 5 includes a reference FP selecting unit 33, a peak pattern creating unit 35, and a peak assigning unit 37.
The multi-component medicine evaluation device 1 is constituted by one computer, for example, and includes a CPU, ROM, RAM, and the like, which are not illustrated. The evaluation device 1 executes an FP characteristic value creation program, which is a pattern characteristic value creation program installed on the computer, and can thereby obtain the characteristic value of the FP. Alternatively, the characteristic value of the FP may be obtained by having the computer that constitutes the evaluation device 1 read the program from a recording medium on which the FP characteristic value creation program is recorded.
The evaluation device 1 for a multicomponent drug may be configured such that each part is constituted by a separate computer, for example, the target FP peak assigning part 5, the target FP peak feature value creating part 7, the target FP type 2 creating part 9, the target FP region division feature value creating part 11, the target FP feature value merging part 13, and the evaluation part 27 are constituted by one computer, and the reference FP creating part 31, the reference FP peak assigning part 15, the reference FP attribute result merging part 17, the reference FP peak feature value creating part 19, the reference FP type 2 creating part 21, the reference FP region division feature value creating part 23, and the reference FP feature value merging part 25 are constituted by another computer.
In this case, the reference FP integrated feature value is created by another computer and input to the evaluation unit 27 of the evaluation device 1.
In this way, the target FP creation unit 29, the target FP peak assigning unit 5, the target FP peak feature value creation unit 7, the target FP type 2 creation unit 9, the target FP region division feature value creation unit 11, and the target FP feature value merging unit 13 create the target FP integrated feature value, and the reference FP creation unit 31, the reference FP peak assigning unit 15, the reference FP assignment result merging unit 17, the reference FP peak feature value creation unit 19, the reference FP type 2 creation unit 21, the reference FP region division feature value creation unit 23, and the reference FP feature value merging unit 25 create the reference FP integrated feature value. The two integrated feature values are then compared to evaluate the equivalence of the target FP43 and the reference FP group 45.
The object FP preparation unit 29 of the FP preparation unit 3 is a functional unit as shown in fig. 2 and 3, and prepares an object FP43 (hereinafter, also simply referred to as "FP 43") by extracting a plurality of peaks at a specific detection wavelength, retention time thereof, and UV spectrum from the 3D chromatogram 41, which is three-dimensional chromatogram data of a chromatogram of the chinese medicine 39.
This FP43 is composed of three-dimensional information (peak, retention time, and UV spectrum) as in the 3D chromatogram 41.
Therefore, FP43 inherits the data unique to the medicine as it is. However, since the data volume is compressed to about 1/70, the amount of information to be processed is significantly reduced and the processing speed is increased compared with the 3D chromatogram 41.
The 3D chromatogram 41 is the result of applying high-performance liquid chromatography (HPLC) to the Chinese medicine 39. The 3D chromatogram 41 represents the moving speed of each component, displayed either as the moving distance at a specific time or as a graph of the pattern emerging from the end of the column in time series. In HPLC, the detector response is plotted against the time axis, and the appearance time of a peak is called the retention time.
The detector is not particularly limited. An absorbance detector, which uses optical properties, may be used, in which case the peak is the signal intensity obtained three-dimensionally with respect to the ultraviolet (UV) detection wavelength. A transmittance detector may also be used as a detector utilizing optical properties.
The detection wavelength is not limited, but is preferably a plurality of wavelengths selected from the range of 150nm to 900nm, particularly preferably from 200nm to 400nm in the ultraviolet-visible light absorption region, and more preferably from 200nm to 300 nm.
The 3D chromatogram 41 has at least the number (lot number) of the chinese medicine, the retention time, the detection wavelength, and the peak as data.
The 3D chromatogram 41 can be obtained by a commercially available apparatus, for example, Agilent1100 system. The chromatography is not limited to HPLC, and various methods can be used.
As shown in fig. 2 and 3, the 3D chromatogram 41 is displayed with the x-axis as the retention time, the y-axis as the detection wavelength, and the z-axis as the signal intensity.
FP43 has at least the number of the Chinese medicine (lot number), the retention time, the peak of a specific wavelength and UV spectrum as data.
As shown in fig. 2 and 3, FP43 is displayed two-dimensionally with the x-axis as the retention time and the y-axis as the peak at a specific detection wavelength, and each peak carries UV spectrum information like the UV spectrum 42 shown for peak 1 in fig. 3. The specific detection wavelength for creating FP43 is not particularly limited and can be selected in various ways. However, to inherit the information of the 3D chromatogram, it is important that FP43 include all peaks in the 3D chromatogram. Therefore, in example 1, the detection wavelength is set to 203 nm, at which all peaks in the 3D chromatogram are included.
On the other hand, all peaks may not be included in a single wavelength. In this case, a plurality of detection wavelengths are used, and as will be described later, an FP is formed which combines a plurality of wavelengths and includes all peaks.
In embodiment 1, the peak is the maximum value of the signal intensity (peak height), but an area value may be used as the peak instead. In addition, the FP may be two-dimensional information that does not contain the UV spectrum, with the x-axis as the retention time and the y-axis as the peak at a specific detection wavelength. In this case, the FP may be prepared from a 2D chromatogram having the number of the Chinese medicine (lot number) and the retention time as data.
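The extraction of an FP from a 3D chromatogram described above can be sketched as follows. This is only an illustration under stated assumptions: the data layout (`times`, `wavelengths`, `signal`), the names `Peak` and `make_fp`, and the simple local-maximum rule for finding peaks are all hypothetical and are not taken from this description, which does not specify a peak detection algorithm.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    retention_time: float     # position on the time axis (minutes)
    height: float             # signal intensity at the detection wavelength
    uv_spectrum: List[float]  # intensity at every recorded wavelength


def make_fp(times, wavelengths, signal, detect_nm, min_height=0.0):
    """Build an FP (list of Peak) from 3D chromatogram data.

    signal[i][j] is the intensity at times[i] and wavelengths[j].
    Assumed rule: a peak is a local maximum of the trace at detect_nm
    that lies above min_height.
    """
    j = wavelengths.index(detect_nm)
    trace = [row[j] for row in signal]
    fp = []
    for i in range(1, len(trace) - 1):
        is_local_max = trace[i] > trace[i - 1] and trace[i] >= trace[i + 1]
        if is_local_max and trace[i] > min_height:
            # Keep the whole UV spectrum at the peak position.
            fp.append(Peak(times[i], trace[i], list(signal[i])))
    return fp
```

Each element of the result keeps the three kinds of information (peak, retention time, UV spectrum) while discarding every non-peak time point, which is where the data reduction comes from.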
Fig. 4(a) is FP55 of drug a, fig. 4(B) is FP57 of drug B, and fig. 4(C) is FP59 of drug C.
The target FP peak assigning unit 5 is a functional unit that compares the peaks of the target FP with those of a reference FP of the multi-component substance that corresponds to the target FP and serves as the evaluation reference, and identifies the corresponding peaks. The target FP peak assigning unit 5 includes a reference FP selecting unit 33, a peak pattern creating unit 35, and a peak assigning unit 37.
The reference FP selection unit 33 is a functional unit that selects, from the plurality of reference FPs, the FP of the multi-component substance that is suitable for peak assignment of the target FP. That is, in order to assign each peak of the target FP with high accuracy, the degree of coincidence of the retention time appearance patterns of the peaks is calculated between the target FP and each reference FP as shown in fig. 5 to 9 (described later), and the reference FP whose degree-of-coincidence value is smallest (a smaller value indicating a closer match) is selected from all the reference FPs.
As shown in fig. 10 to 12 (described later), the peak pattern creating unit 35 is a functional unit that, for a peak of the target FP61 that is to be assigned (hereinafter referred to as a belonging target peak), creates a peak pattern composed of a total of n+1 peaks, namely the belonging target peak and n peaks located before and/or after it in the time axis direction, as the peak pattern of the belonging target peak. Here, n is a natural number.
Fig. 11 (described later) shows a peak pattern including a total of 3 peaks including 2 peaks present at least one of front and rear in the time axis direction, and fig. 12 (described later) shows a peak pattern including a total of 5 peaks including 4 peaks present at least one of front and rear in the time axis direction.
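A minimal sketch of forming such n+1-peak patterns is given below. It assumes that the n surrounding peaks are taken as consecutive neighbours, so that every window of n+1 consecutive peaks containing the peak of interest counts as one pattern. The function name `peak_patterns` is hypothetical, and the comprehensive variants of fig. 23 to 61, in which the pattern-constituting peaks are changed, are not covered.

```python
def peak_patterns(peaks, i, n):
    """Return every window of n + 1 consecutive peaks that contains peaks[i].

    peaks must be ordered by retention time. A window may lie before,
    after, or around peaks[i], matching "before and/or after" in the text.
    """
    patterns = []
    for start in range(i - n, i + 1):
        end = start + n + 1
        if start >= 0 and end <= len(peaks):
            patterns.append(peaks[start:end])
    return patterns
```

With n = 2 this yields 3-peak patterns as in fig. 11, and with n = 4 it yields 5-peak patterns as in fig. 12; a peak at the edge of the FP simply has fewer windows available.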
As shown in fig. 13 to 22 (described later), the peak pattern creating unit 35 is a functional unit that creates a peak pattern, which is composed of a total of n +1 peaks including n peaks present at least one of before and after the time axis direction, as a peak pattern of the belonging candidate peak, with respect to all peaks (hereinafter, referred to as belonging candidate peaks) within a range (allowable width) in which a difference from the holding time of the belonging target peak is set in the reference FP 83. Fig. 15 to 18 (described later) show peak patterns each including a total of 3 peaks including 2 peaks existing at least one of front and rear in the time axis direction. Fig. 19 to 22 (described later) show peak patterns each including a total of 5 peaks including 4 peaks existing at least one of front and rear in the time axis direction.
The allowable width is not limited, but is preferably 0.5 to 2 minutes from the viewpoint of accuracy and efficiency. In example 1, it was set to 1 minute.
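The selection of belonging candidate peaks by the allowable width can be illustrated with the short sketch below. The function name `candidate_peaks` and the choice to treat the width as a symmetric, inclusive window around the retention time of the belonging target peak are assumptions made for illustration.

```python
def candidate_peaks(target_rt, reference_rts, allowable_width=1.0):
    """Return the reference retention times lying within the allowable
    width (in minutes) of the retention time of the belonging target peak."""
    return [rt for rt in reference_rts
            if abs(rt - target_rt) <= allowable_width]
```

A narrower width leaves fewer candidates to compare and so runs faster, while a wider one tolerates larger retention time shifts, which is the trade-off between accuracy and efficiency mentioned above.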
In addition, the peak pattern creating unit 35 can flexibly cope with a difference in the number of peaks between the object FP61 and the reference FP83 (that is, a peak not existing in either one of them). Therefore, as shown in fig. 23 to 61 (described later), among the belonging target peak and the belonging candidate peak (both of them), a peak constituting a peak pattern (hereinafter, referred to as a peak pattern constituting peak) is changed to create a peak pattern comprehensively. Fig. 23 to 61 show peak patterns each including a total of 3 peaks including 2 peaks present at least one of front and rear in the time axis direction.
The peak assigning unit 37 is a functional unit that compares the peak patterns of the target FP and the reference FP and specifies the corresponding peaks. In the embodiment, it calculates the degree of coincidence between the peak pattern of the belonging target peak and the peak pattern of each belonging candidate peak, and the degree of coincidence between their UV spectra, to identify the corresponding peak.
This functional unit calculates the matching degree of each belonging candidate peak by combining the two degrees of coincidence, and assigns each peak of the object FP61 to a peak of the reference FP83 based on that matching degree.
In the peak assigning unit 37, the degree of matching of the peak patterns is calculated from the differences in peak and retention time between corresponding peaks of the peak pattern of the belonging target peak and the peak pattern of the belonging candidate peak, as shown in fig. 62 to 64 (described later). As shown in fig. 65 and 66 (described later), the matching degree of the UV spectra is calculated from the difference between the absorbance at each wavelength of the UV spectrum 135 of the belonging target peak 73 and the absorbance at each wavelength of the UV spectrum 139 of the belonging candidate peak 95. Further, as shown in fig. 67 (described later), the two degrees of coincidence are multiplied to give the degree of coincidence of the belonging candidate peak 95.
The target FP peak feature value creation unit 7 is a functional unit that: the peak identified and assigned by the target FP peak assignment unit 5 and the peaks of the reference group FP45 belonging to the plurality of reference FPs are compared and evaluated to generate a characteristic value of the target FP peak. The plurality of standard FPs are made corresponding to a plurality of Chinese medicines of multi-component substances serving as evaluation standards, and the plurality of Chinese medicines are normal products.
That is, the target FP peak feature value creation unit 7 is a functional unit that: based on the result of attribution between the target FP61 and the reference FP83, finally, as shown in fig. 2, 68, and 69 (described later), the target FP peak feature value 47 is created by attributing each peak of the target FP43 to each peak of the reference group FP45 and is characterized.
The target FP type 2 creation unit 9 is a functional unit that removes the characterized peaks from the target pattern and creates a pattern consisting of the remaining peaks as the target pattern type 2. For example, it removes the peaks characterized as the target FP peak feature value 47 by the target FP peak feature value creation unit 7 from the original target FP43, and creates an FP composed of the remaining peaks and their retention times as the target FP type 2(49) of fig. 2, which serves as the target pattern type 2.
The target FP type 2(49) collects the peaks not subjected to the feature value in the target FP peak feature value creation unit 7 to create FPs. By evaluating the characteristic value of the object FP type 2(49), more accurate evaluation can be performed.
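Creating the FP type 2 amounts to a simple filtering step, sketched below. The representation of an FP as a list of (retention time, peak height) tuples and the name `make_fp_type2` are assumptions made for illustration only.

```python
def make_fp_type2(fp, characterized_indices):
    """Return the peaks of fp that were NOT characterized by peak assignment.

    fp is a list of (retention_time, height) tuples, and
    characterized_indices holds the positions of the peaks already used
    for the FP peak feature value.
    """
    used = set(characterized_indices)
    return [peak for k, peak in enumerate(fp) if k not in used]
```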
The target FP region division feature value creation unit 11 is a functional unit serving as the target pattern region division feature value creation unit: it divides the target FP type 2(49), as the target pattern type 2, into a plurality of regions, and creates a target FP region division feature value, as the target pattern region division feature value, from the existence rate of the peaks existing in each region.
The target FP region division feature value creation unit 11 may use the amount of existence instead of the existence rate. As described later, the existence rate is a value obtained by dividing the existence amount of the peak height of each region by the total peak height (that is, the existence amount of the peak height of the whole). Therefore, the existence amount of the peak height of each region itself may be used to create the region division feature value. As shown in fig. 70 (described later), the target FP region division eigenvalue creation unit 11 divides the target FP type 2(49) into grid-like regions by a plurality of vertical dividing lines parallel to the signal intensity axis and a plurality of horizontal dividing lines parallel to the time axis, thereby creating the target FP region division eigenvalue 51 of fig. 2.
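The grid division and the existence rate can be sketched as follows. The sketch rests on one interpretation that this passage does not state explicitly: each peak is treated as a vertical segment from zero up to its height, and the part of that segment falling inside a region counts as that region's existence amount. The function name `region_features` and its parameters are hypothetical.

```python
from bisect import bisect_right


def region_features(peaks, time_lines, height_lines, use_rate=True):
    """Grid feature values for an FP given as (retention_time, height) tuples.

    time_lines:   sorted positions of the vertical dividing lines
    height_lines: sorted positions of the horizontal dividing lines
    Returns grid[row][col], with row 0 being the lowest intensity band.
    """
    bands = [0.0] + list(height_lines) + [float("inf")]
    n_rows = len(bands) - 1
    n_cols = len(time_lines) + 1
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for rt, height in peaks:
        col = bisect_right(time_lines, rt)
        for row in range(n_rows):
            lo, hi = bands[row], bands[row + 1]
            # Assumed rule: portion of the peak height inside this band.
            grid[row][col] += max(0.0, min(height, hi) - lo)
    if use_rate:
        total = sum(height for _, height in peaks)
        if total > 0:
            grid = [[value / total for value in row] for row in grid]
    return grid
```

Passing `use_rate=False` returns the existence amounts themselves, which corresponds to the alternative mentioned above of using the amount instead of the rate.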
The target FP feature value merging unit 13 is a functional unit that merges the target FP peak feature value 47 created by the target FP peak feature value creation unit 7 and the target FP region division feature value 51 created by the target FP region division feature value creation unit 11 to create a target FP integrated feature value.
On the other hand, the reference FP creation unit 31 of the FP creation unit 3 is a functional unit that creates a plurality of reference FPs, as in the case of the target FP creation unit 29. For example, a reference FP is created by extracting a plurality of peaks at a specific detection wavelength, the retention time thereof, and a UV spectrum for each reference chinese medicine from each 3D chromatogram to which three-dimensional chromatogram data of a plurality of chinese medicines (reference chinese medicines) determined as normal products belongs.
The reference FP peak assigning unit 15 is a functional unit for identifying a peak to be assigned by pattern recognition, as in the target peak assigning unit 5. However, in this reference FP peak assignment unit 15, the peak is specified by sequentially calculating the assignment score for all reference FPs by the selected combination.
The reference FP attribution result merging unit 17 is a functional unit that merges peaks specified and attributed by the reference peak attribution unit 15 to create a reference peak correspondence table (described later).
The reference FP peak feature value creating unit 19 is a functional unit that creates a reference FP peak feature value in which the plurality of reference FPs are characterized, based on the reference peak correspondence table created by the reference FP attribution result merging unit 17.
The reference FP type 2 creating unit 21 is a functional unit having the same function as the target FP type 2 creating unit 9, and removes the peak subjected to the above-described characteristic value from each of the plurality of reference FPs to create an FP composed of the remaining peak and the holding time thereof as the reference FP type 2.
The reference FP region division feature value creation unit 23 is a functional unit having the same function as the target FP region division feature value creation unit 11, and as the FP region division feature value creation unit, divides the reference FP type 2 into a plurality of regions and creates a reference FP region division feature value from the existence rate of the peak existing in each region.
However, the reference FP area division feature value creating unit 23 changes the positions of the divided areas and creates a reference FP area division feature value before and after the change. That is, the positions of the respective regions are changed by changing and setting the positions so that the respective vertical and horizontal dividing lines are moved in parallel within the setting range.
The reference FP feature value merging unit 25 is a functional unit having the same function as the target FP feature value merging unit 13, and merges the reference FP peak feature value and the reference FP region division feature value to create a reference FP integrated feature value.
The evaluation unit 27 compares and evaluates the target pattern merged feature value and a reference pattern merged feature value generated based on a plurality of reference patterns corresponding to the target pattern merged feature value and serving as evaluation criteria. That is, the evaluation unit 27 is a functional unit that compares and evaluates the target FP integrated feature value as the target pattern integrated feature value and the reference FP integrated feature value as the reference pattern integrated feature value. In the examples, the equivalence between the target FP integrated characteristic value and the reference FP integrated characteristic value was evaluated by the MT method.
The MT method is a commonly known calculation method in quality engineering. For example, the description is given in "mathematical theory of quality engineering" published by the Japanese standards Association (2000) at pages 136 and 138, and the quality engineering application lecture "technical development of chemical and pharmaceutical biology" published by the Japanese standards Association (1999) at pages 454 and 456, and quality engineering 11(5), 78-84(2003), and entry MT system (2008).
In addition, general commercially available MT method program software can be used. Commercially available MT method program software includes: ATMTS by Angley (Inc.), TM-ANOVA by Japan standards Association (Profibus), MT method for windows by Ohken (Inc.), and the like.
The evaluation unit 27 assigns an axis of variable of the MT method to one of the lot number, the retention time, and the UV detection wavelength of the chinese medicine of the subject FP43, and sets the peak as a characteristic value in the MT method.
The assignment of the variable axis is not particularly limited, but it is preferable that the holding time is assigned to a so-called item axis of the MT method, the number of the multicomponent drug is assigned to a so-called number row axis, and the peak is assigned to a so-called characteristic value of the MT method.
Here, the item axis and the number row axis are defined as follows. That is, in the MT method, the average $m_j$ and the standard deviation $\sigma_j$ are obtained for the data set $X_{ij}$, the value obtained by normalizing $X_{ij}$ is $(X_{ij} - m_j)/\sigma_j$, and the correlation coefficient r between i and j is obtained to derive the unit space or the Mahalanobis distance. In this case, the item axis and the number row axis are defined such that "the average $m_j$ and the standard deviation $\sigma_j$ are obtained by changing the value of the number row axis for each value of the item axis".
The MT method is used to obtain the reference point and the unit quantity (hereinafter, simply referred to as "unit space") from the data and the feature value to which the axis is assigned. Here, the reference point, the unit amount, and the unit space are defined according to the description of the MT method.
The MD value is obtained by the MT method as a value indicating a degree of difference from the unit space of the drug to be evaluated. Here, the MD value is defined in the same manner as described in the literature of the MT method, and is obtained by the method described in the literature.
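For orientation, the quantities named above can be written out as follows. This is the form commonly given in MT method textbooks, not a formula quoted from this description, which defers to the cited literature. The symbols $k$ (number of items on the item axis), $\mathbf{x}$ (normalized data of the drug to be evaluated), and $R$ (correlation matrix of the unit space) are notation assumed here.

```latex
% Normalization of the evaluated sample X_j using the unit-space statistics
x_j = \frac{X_j - m_j}{\sigma_j}, \qquad j = 1, \dots, k

% Correlation matrix of the normalized unit-space data
R = \left( r_{jl} \right)_{j,l = 1}^{k}

% MD value: squared Mahalanobis distance scaled by the number of items
\mathrm{MD} = D^2 = \frac{1}{k}\, \mathbf{x}^{\mathsf{T}} R^{-1} \mathbf{x}
```

With the scaling by $1/k$, samples belonging to the unit space have MD values that average about 1, so a value well above 1 indicates departure from the normal products.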
Using the MD values obtained in this manner, the degree of difference between the drug to be evaluated and the plurality of drugs evaluated as normal can be determined.
For example, by performing the attribution processing as described above for each object FP in FIGS. 87 to 91, MD values (MD values: 0.26, 2.20, etc.) can be obtained by the MT method described above.
To evaluate an MD value against the MD values of normal products, MD values are determined in the same way for a plurality of medicines evaluated as normal products. By setting a threshold value based on the MD values of the normal products and plotting the MD value of the drug to be evaluated, as shown by the evaluation result 53 of the evaluation unit 27 in fig. 2, the drug can be judged to be a normal product or an abnormal product. In the evaluation result 53 of the evaluation unit 27 in fig. 2, for example, a drug with an MD value of 10 or less is judged to be a normal product.
The evaluation unit 27 only needs to be able to compare and evaluate the equivalence between the target FP integrated feature value and the reference FP integrated feature value, and a pattern recognition method other than the MT method may also be applied.
Operating principle of peak pattern processing
Fig. 5 to 69 are diagrams for explaining the operation principle of the reference FP selection unit 33, the peak pattern creation unit 35, the peak assignment unit 37, and the target FP peak feature value creation unit 7.
Fig. 5 to 9 are diagrams for explaining the degree of coincidence between the retention time appearance patterns of the target FP and the reference FP with the reference FP selection unit 33. Fig. 5 is a diagram showing the holding times of the object FP and the reference FP, fig. 6 is a diagram showing the appearance pattern of the holding times of the object FP, and fig. 7 is a diagram showing the appearance pattern of the holding times of the reference FP. Fig. 8 is a diagram showing the number of coincidence between the retention time appearance distances of the object FP and the reference FP, and fig. 9 is a diagram showing the degree of coincidence between the retention time appearance patterns of the object FP and the reference FP.
Fig. 5 shows the holding times of the object FP61 and the reference FP 83. Fig. 6 and 7 show retention time appearance patterns in which the distances between all retention times are calculated from the retention times of the object FP61 and the reference FP83, and the distances are collected in a table format. In fig. 8, the number of coincidence between the appearance distances of the retention times is calculated from these appearance patterns, and the number of coincidence between the appearance distances of the retention times, which are integrated into a table, is displayed. In fig. 9, the degree of coincidence of the retention time appearance patterns is calculated based on the number of coincidences, and the degree of coincidence of the retention time appearance patterns, which are collected in a table format, is displayed.
Fig. 10 to 12 are explanatory views of a peak pattern created by the belonging peak and the peripheral peaks of the pattern creating unit 35. Fig. 10 is a diagram showing the belonging target peak of the target FP, fig. 11 is an explanatory diagram of a peak pattern made up of 3 peaks including 2 peripheral peaks, and fig. 12 is an explanatory diagram of a peak pattern made up of 5 peaks including 4 peripheral peaks.
Fig. 13 and 14 are explanatory diagrams of the relationship between the belonging peak and the belonging candidate peak of the peak pattern creating unit 35, fig. 13 is a diagram showing the allowable width of the belonging peak, and fig. 14 is a diagram showing the belonging candidate peak of the reference FP with respect to the belonging peak.
Fig. 15 to 18 are examples of peak maps of the belonging target peak and the belonging candidate peak created by the 3 peaks of the peak pattern creating unit 35. Fig. 15 is a peak pattern diagram formed of 3 peaks of the belonging peak and the belonging candidate peak, fig. 16 is a peak pattern diagram formed of 3 peaks of the belonging peak and the other belonging candidate peaks, fig. 17 is a peak pattern diagram formed of 3 peaks of the belonging peak and the other belonging candidate peaks, and fig. 18 is a peak pattern diagram formed of 3 peaks of the belonging peak and the other belonging candidate peaks.
Fig. 19 to 22 are peak pattern diagrams of the belonging target peak and the belonging candidate peaks created by the 5 peaks of the peak pattern creation unit 35.
Fig. 23 to 61 are diagrams for describing the principle of the comparison in an extensive manner by creating peak patterns of the assignment target peak and the assignment candidate peak of the peak pattern creation unit 35 in an extensive manner.
Fig. 62 and 63 are diagrams illustrating a method of calculating the degree of matching of peak patterns created by the 3 peaks of the peak assigning unit 37.
Fig. 64 is a diagram for explaining a method of calculating the matching degree of the peak patterns created by the 5 peaks of the peak assigning section 37.
Fig. 65 is a diagram showing UV spectra 135 and 139 of the target peak 73 and the candidate peak 95 belonging to the peak belonging part 37.
Fig. 66 is an explanatory diagram of the matching degree between the UV spectrum 135 of the assignment target peak 73 and the UV spectrum 139 of the assignment candidate peak 95 in the peak assignment unit 37.
Fig. 67 is an explanatory diagram of the degree of matching of the belonging candidate peaks calculated from the degree of matching of the peak patterns of the belonging target peak 73, the belonging candidate peak 95, and the degree of matching of the UV spectrum in the peak belonging unit 37.
Fig. 68 is a diagram illustrating attribution of the object FP43 of the peak attribution unit 37 to the reference group FP45 for each peak.
Fig. 69 is an explanatory diagram of the target FP peak feature value 47 showing a state in which each peak of the target FP43 of the peak assigning unit 37 belongs to the reference group FP 45.
Selection of reference FP
The function of the reference FP selection unit 33 will be described further with reference to fig. 5 to 9.
Fig. 5 is a diagram showing the holding times of the object FP and the reference FP, fig. 6 is a diagram showing the appearance pattern of the holding times of the object FP, and fig. 7 is a diagram showing the appearance pattern of the holding times of the reference FP. Fig. 8 is a diagram showing the number of coincidence between the retention time appearance distances of the object FP and the reference FP, and fig. 9 is a diagram showing the degree of coincidence between the retention time appearance patterns of the object FP and the reference FP.
Fig. 5 shows the holding times of the object FP61 and the reference FP 83. Fig. 6 and 7 show retention time appearance patterns in which the distances between all retention times are calculated from the retention times of the object FP61 and the reference FP83, and the distances are collected in a table format. In fig. 8, the number of coincidence between the appearance distances of the retention times is calculated from these appearance patterns, and the number of coincidence between the appearance distances of the retention times, which is obtained by integrating these numbers into a table, is displayed. In fig. 9, the degree of coincidence of the retention time appearance patterns is calculated based on the coincidence number, and the degree of coincidence of the retention time appearance patterns is displayed in a table format.
In the peak assignment processing of the object FP61, each peak of the object FP61 is assigned using a reference FP whose FP pattern is as similar as possible to that of the object FP61. Selecting a reference FP similar to the object FP61 from the plurality of reference FPs is important for achieving highly accurate assignment.
Here, as a method of objectively and easily evaluating the similarity of the FP pattern to that of the object FP61, the similarity is evaluated by the degree of coincidence of the retention time appearance patterns.
For example, the holding times of the object FP61 and the reference FP83 are as shown in fig. 5, and the patterns of the holding times of the object FP61 and the reference FP83 appear as shown in fig. 6 and 7. In fig. 6 and 7, the object FP61 and the reference FP83 on the upper layer are patterned in a table format in which the value of each cell is constituted by the pitch of the retention time as shown in the lower layer chart.
In fig. 6, the retention time of each peak (63, 65, 67, 69, 71, 73, 75, 77, 79, 81) of the object FP61 is (10.2), (10.5), (10.8), (11.1), (11.6), (12.1), (12.8), (13.1), (13.6), (14.0).
Therefore, the pitch of the retention time between the peak 63 and the peak 65 is (10.5) - (10.2) = 0.3. Similarly, it is (0.6) between the peaks 63 and 67, (0.3) between the peaks 65 and 67, and so on. Proceeding in the same way gives the object FP appearance pattern in the lower table of fig. 6.
In fig. 7, the retention time of each peak (85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105) of the reference FP83 is (10.1), (10.4), (10.7), (11.1), (11.7), (12.3), (12.7), (13.1), (13.6), (14.1), and (14.4).
Thus, likewise, the pitches of the retention times give the reference FP appearance pattern in the lower table of fig. 7.
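The construction of a retention time appearance pattern can be sketched as below. The triangular layout, in which each row holds the distances from one peak to every later peak, is an assumption consistent with 10 peaks yielding 9 rows and 11 peaks yielding 10 rows. The name `appearance_pattern` is hypothetical, and distances are rounded only to suppress floating-point noise.

```python
def appearance_pattern(retention_times):
    """Return the table of retention time pitches.

    Row i holds the distance from peak i to each later peak, so an FP
    with p peaks gives p - 1 rows.
    """
    rts = list(retention_times)
    return [[round(rts[j] - rts[i], 3) for j in range(i + 1, len(rts))]
            for i in range(len(rts) - 1)]


# Retention times of the object FP61 as listed for fig. 6.
object_rts = [10.2, 10.5, 10.8, 11.1, 11.6, 12.1, 12.8, 13.1, 13.6, 14.0]
object_pattern = appearance_pattern(object_rts)
```

The first row of `object_pattern` begins 0.3, 0.6, 0.9, which are the pitches between the peak 63 and the peaks 65, 67, and 69.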
As shown in fig. 6 and 7, the peaks that have been patterned are cyclically compared to find the number of matches. For example, the values of the cells of the target FP appearance pattern in the lower graph in fig. 6 and the values of the cells of the reference FP appearance pattern in the lower graph in fig. 7 are compared to find the number of matches as shown in fig. 8.
That is, the pitches of all the holding times of the holding time appearance patterns of the object FP61 and the reference FP83 are cyclically compared in line units, and the number of distances matching each other is calculated within a set range.
For example, when comparing 1 line of the object and reference FP holding time appearance patterns in fig. 6 and 7, the number of coincidence is 7. These 7 coincidence counts are written in line 1 of the object and reference FP hold time appearance pattern of fig. 8. Similarly, for the other rows in fig. 6 and 7, the matching numbers are obtained by cyclically comparing the number of rows 1 to 9 of the target FP holding time appearance pattern and the number of rows 1 to 10 of the reference FP holding time appearance pattern.
The results are shown in fig. 8. In fig. 8, the numerical value of 7 at the left end surrounded by a circle is the comparison result of the line 1 of the object and reference FP holding time appearance pattern, and the numerical value of the adjacent 7 is the comparison result of the line 1 of the object FP holding time appearance pattern and the line 2 of the reference FP holding time appearance pattern. The range is not limited, but is preferably 0.05 to 0.2 minutes. Example 1 was set to 0.1 min.
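Counting the matches between one row of each appearance pattern can be sketched as follows. The one-to-one greedy pairing is an assumption, as is the inclusive tolerance; the exact pairing and rounding rules behind fig. 8 are not given in the text, so this sketch is not guaranteed to reproduce the counts shown there. The name `count_matches` is hypothetical.

```python
def count_matches(row_a, row_b, tolerance=0.1):
    """Count distances in row_a that have a partner in row_b within tolerance.

    Each value of row_b is used at most once, so identical values are
    never counted repeatedly.
    """
    remaining = list(row_b)
    matches = 0
    for d in row_a:
        for k, e in enumerate(remaining):
            # The small epsilon guards against floating-point rounding
            # when the difference lies exactly on the tolerance.
            if abs(d - e) <= tolerance + 1e-9:
                matches += 1
                del remaining[k]
                break
    return matches
```

Applying this function to every pairing of a target row with a reference row gives a table of match counts of the kind shown in fig. 8.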
Let RP be the degree of coincidence of the retention time appearance patterns. The degree of coincidence ($RP_{fg}$) between the retention time appearance pattern of the f-th row of the object FP61 and that of the g-th row of the reference FP83 is calculated using the Tanimoto coefficient as
$RP_{fg} = \{1 - (m/(a + b - m))\} \times (a - m + 1)$.
In the formula, a denotes the number of peaks of the object FP61 (the number of peaks of the object FP), b denotes the number of peaks of the reference FP83 (the number of peaks of the reference FP), and m denotes the number of coincidence of appearance patterns in the retention time (the number of coincidence of appearance distances, see fig. 8). The coincidence degree (RP) of the appearance pattern at each retention time is calculated from the above equation based on the coincidence numbers in fig. 8 (see fig. 9).
The minimum value (RP _ min) of these RPs is set as the degree of coincidence between the retention time appearance pattern of the object FP61 and the reference FP 83. In fig. 9, (0.50) represents the degree of coincidence of the object FP61 with respect to the reference FP.
The matching degrees are calculated for all reference FPs, the reference FP with the smallest matching degree is selected, and the peaks of the target FP are assigned to that reference FP.
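The calculation of RP and the selection of the reference FP can be sketched as follows. The function `rp` follows the formula given above, with a, b, and m as defined there. The function `select_reference`, its input layout, and the labels `ref_A` and `ref_B` are hypothetical and serve only as an illustration.

```python
def rp(a, b, m):
    """Degree of coincidence: (1 - Tanimoto coefficient) * (a - m + 1).

    a: number of target FP peaks, b: number of reference FP peaks,
    m: number of matching appearance distances. Smaller is better.
    """
    tanimoto = m / (a + b - m)
    return (1.0 - tanimoto) * (a - m + 1)


def select_reference(a, candidates):
    """Pick the reference FP with the smallest RP_min.

    candidates maps a label to (b, match_counts), where match_counts
    holds the m values from all row-by-row comparisons.
    """
    best = None
    for name, (b, match_counts) in candidates.items():
        rp_min = min(rp(a, b, m) for m in match_counts)
        if best is None or rp_min < best[1]:
            best = (name, rp_min)
    return best


# Illustrative numbers only: 10 target peaks, two made-up references.
print(select_reference(10, {"ref_A": (11, [7, 9]), "ref_B": (12, [5, 6])}))
```

For a = 10, b = 11, and m = 9, the formula gives (1 - 9/12) × 2 = 0.5, while m = 7 gives 2.0, which shows how a larger number of matches lowers the value.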
The reference FP selecting unit 33 may instead pattern the object FP61 and the reference FP83 by peak height ratio.
The peaks patterned with the peak height ratio are cyclically compared, and the number of height ratio matches is calculated within a set range. Thus, the matching number can be obtained in the same manner as in fig. 8.
In addition, when patterning is performed with the peak height ratio, there are cases where there are a plurality of identical values in 1 line, and this must not be counted repeatedly.
The matching degree is obtained by setting the Tanimoto coefficient to "number of height ratio matches / (number of target FP peaks + number of reference FP peaks - number of height ratio matches)" and taking (1 - Tanimoto coefficient), which approaches zero as the match improves.
Further, (1 - Tanimoto coefficient) is weighted by (number of target FP peaks - number of matches + 1) to obtain "(1 - Tanimoto coefficient) × (number of target FP peaks - number of matches of the appearance distance or the height ratio + 1)". With this weighting, a reference FP that matches more of the peaks (63, 65, ...) of the target FP61 can be selected.
Feature valuation of peak patterns
The function of the peak pattern creating section 35 will be described further with reference to fig. 10 to 67.
As shown in fig. 10, when the belonging peak 73 is assigned to any peak of the reference FP83, it becomes a problem as to which peak should be assigned. Even when the peak assignment is performed only on any one of the peak, the retention time, and the UV spectrum, since any one of these three kinds of information contains errors due to the errors between the medicines and the analysis errors, there is a limit to the accuracy of the peak assignment by the individual information.
As shown in fig. 13 and 14, an allowable width of the variation in the holding time is set between the peak to be assigned 73 and each peak of the reference FP83, and the assignment target is determined by integrating all the information of the peak of the reference FP83 (hereinafter referred to as an assignment candidate peak) and the peak assignment of the UV spectrum information existing within the allowable width, so that the accuracy is improved as compared with the peak assignment by the above-mentioned individual information.
However, even if the peak assignment using three types of information is performed, since the UV spectra of similar components are almost the same in terms of the characteristics of the UV spectrum, when the assignment candidate peak contains a plurality of similar components, only the peak information is assigned, and sufficient accuracy cannot be obtained. Therefore, in order to perform peak assignment with higher accuracy, information needs to be added to these three kinds of information.
Here, a peak pattern including information of the peripheral peaks shown in fig. 11 and 12 is created, and peaks are assigned by comparing the peak patterns.
When a peak pattern including peripheral peaks is created, peripheral information is added to the three types of information so far, and peak assignment can be performed using the four types of information, thereby achieving higher assignment accuracy.
As a result, a large number of peaks can be simultaneously assigned with high accuracy and efficiency by a one-time assignment process.
Further, by making the data used for peak assignment four kinds of information including peripheral information, the restriction conditions (peak definition and the like) set at the time of existing peak assignment are not necessary.
In fig. 11, a peak pattern 115 including peaks 71 and 75 on both sides in the time axis direction with respect to the belonging peak 73 is created.
In fig. 12, a peak pattern 125 including peaks 69, 71, 75, and 77 on both sides in the time axis direction with respect to the belonging peak 73 is created.
In fig. 13 and 14, an allowable width of the deviation of the holding time is set between the belonging peak 73 and each peak of the reference FP83, and the peak of the reference FP83 existing within the allowable width is made as a candidate peak (hereinafter referred to as belonging candidate peak) corresponding to the belonging peak 73.
In fig. 15, a peak pattern 117 including peaks 91 and 95 existing on both front and rear sides in the time axis direction with respect to the candidate peak 93 is created as a peak pattern to be compared with the peak pattern 115 of the belonging peak 73.
In fig. 16 to 18, peak patterns 119, 121, and 123 including peaks existing on both the front and rear sides in the time axis direction with respect to the other candidate peaks 95, 97, and 99 are created as peak patterns to be compared with the peak pattern 115 of the belonging peak 73.
When comparing the peak patterns with higher accuracy, it is necessary to create a peak pattern in which the number of peripheral peaks is increased in both the target FP and the reference FP, as shown in fig. 19 to 22.
For example, when a comparison of peak patterns formed by a total of 5 peaks including 4 peaks in the periphery is made, higher attribution accuracy can be obtained.
In fig. 19, a peak pattern 127 including peaks 89, 91, 95, and 97 present on both sides in the time axis direction with respect to the candidate peak 93 is created as a peak pattern to be compared with the peak pattern 125 of the belonging peak 73.
In fig. 20 to 22, peak patterns 129, 131, and 133 including peaks existing on both the front and rear sides in the time axis direction with respect to the other candidate peaks 95, 97, and 99 are created as peak patterns compared with the peak pattern 125 of the belonging peak 73.
When attribution using this peak pattern is performed with higher accuracy, it is necessary to cope with a case where the number of peaks of the target FP and the reference FP is different (that is, there is a case where there is no peak at either one). Therefore, as shown in fig. 23 to 25, it is necessary to create a peak pattern in which peak pattern constituent peaks change comprehensively, in both the belonging target peak and the belonging candidate peak.
Specifically, a peak as a candidate of a peak pattern constituting peak (hereinafter, referred to as a peak pattern constituting candidate peak) is set in advance from among the peripheral peaks of the target peak belonging to the target FP, and this peak pattern constituting candidate peak is sequentially made as a peak pattern constituting peak to make a peak pattern. Similarly, a peak pattern constituting candidate peak is set for the belonging candidate peak of the reference FP, and the peak pattern constituting candidate peaks are sequentially made as peak pattern constituting peaks to make a peak pattern.
For example, as shown in fig. 23, the 4 neighboring peaks in the time axis direction (69, 71, 75, 77) are set as peak pattern constituting candidate peaks of the belonging target peak 73, the 4 neighboring peaks in the time axis direction (89, 91, 95, 97) are set as peak pattern constituting candidate peaks of the belonging candidate peak 93, and any 2 of them are used as peak pattern constituting peaks in each case. In this case, as shown in fig. 24 and 25, 4C2 (= 6) peak patterns are created for each of the belonging target peak 73 and the belonging candidate peak 93.
When 10 peak pattern constituting candidate peaks are set and any 2 of them are used as peak pattern constituting peaks, 10C2 (= 45) peak patterns can be created for each of the belonging target peak and the belonging candidate peak. When any 4 are used as peak pattern constituting peaks, 10C4 (= 210) peak patterns can be created for each of the belonging target peak and the belonging candidate peak.
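The enumeration of peak patterns described above amounts to choosing k constituting peaks out of the candidate peaks. A minimal sketch, assuming peaks are identified simply by their reference numerals and using the hypothetical helper name `peak_patterns`:

```python
from itertools import combinations
from math import comb


def peak_patterns(center, candidates, k):
    """All peak patterns made of the center peak plus k of the candidate peaks."""
    return [(center, *chosen) for chosen in combinations(candidates, k)]


# Belonging target peak 73 with its 4 neighboring candidate peaks, 2 chosen at a time.
target_patterns = peak_patterns(73, [69, 71, 75, 77], 2)

# The pattern counts quoted in the text follow from the binomial coefficient.
counts = {(4, 2): comb(4, 2), (10, 2): comb(10, 2), (10, 4): comb(10, 4)}
```

Here `target_patterns` holds 6 tuples such as `(73, 69, 71)`, and `counts` reproduces 6, 45, and 210.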
The function of the peak assigning unit 37 will be described further with reference to fig. 26 to 67.
The peak assigning unit 37 calculates the degree of matching of the peak patterns (hereinafter referred to as P_Sim) between all the peak patterns of the belonging target peak and of the belonging candidate peaks created by the peak pattern creating unit 35, based on the differences in the peaks and retention times of the corresponding peaks. The peak assigning unit 37 sets the minimum value of P_Sim (hereinafter referred to as P_Sim_min) as the degree of matching between the peak patterns of the belonging target peak and the belonging candidate peak.
For example, as shown in fig. 26 to 61, in each of the belonging target peak 73 and the belonging candidate peak 93, the 4 peaks before and after it in the time axis direction are set as peak pattern constituting candidate peaks, and any 2 of them are used as peak pattern constituting peaks. With this setting, 4C2 (= 6) peak patterns are created for each of the belonging target peak and the belonging candidate peak. Therefore, P_Sim between the belonging target peak 73 and the belonging candidate peak 93 is calculated for 6 patterns × 6 patterns (= 36 combinations), and P_Sim_min, the minimum of these P_Sim values, is set as the degree of coincidence between the belonging target peak 73 and the belonging candidate peak 93.
When, in each of the belonging target peak 73 and the belonging candidate peak 93, 10 peaks before and after it in the time axis direction are set as peak pattern constituting candidate peaks and any 2 of them are used as peak pattern constituting peaks, 10C2 (= 45) peak patterns are created for each of the belonging target peak and the belonging candidate peak. Therefore, P_Sim between the belonging target peak 73 and the belonging candidate peak 93 is calculated for 45 patterns × 45 patterns (= 2025 combinations), and P_Sim_min, the minimum of these P_Sim values, is set as the degree of coincidence between the belonging target peak 73 and the belonging candidate peak 93. When any 4 are used as peak pattern constituting peaks, 10C4 (= 210) peak patterns are created for each of the belonging target peak and the belonging candidate peak. Therefore, P_Sim between the belonging target peak 73 and the belonging candidate peak 93 is calculated for 210 patterns × 210 patterns (= 44100 combinations), and P_Sim_min, the minimum of these P_Sim values, is set as the degree of coincidence between the belonging target peak 73 and the belonging candidate peak 93.
P_Sim is calculated in the same way for all belonging candidate peaks of the belonging target peak 73.
Fig. 62 and 63 illustrate a method for calculating the degree of matching of peak patterns by comparing peak patterns composed of 3 peaks. In this case, the peak pattern 115 of the belonging target peak 73 and the peak pattern 119 of the belonging candidate peak 95 are taken as examples.
In the peak pattern 115 of the belonging target peak 73, the peak and retention time of the belonging target peak 73 are p1 and r1, the peak and retention time of the peak pattern constituting peak 71 are dn1 and cn1, and the peak and retention time of the peak pattern constituting peak 75 are dn2 and cn2.
In the peak pattern 119 of the belonging candidate peak 95, the peak and retention time of the belonging candidate peak 95 are p2 and r2, the peak and retention time of the peak pattern constituting peak 93 are fn1 and en1, and the peak and retention time of the peak pattern constituting peak 97 are fn2 and en2.
When the degree of matching of the peak patterns is denoted P_Sim, the degree of matching (P_Sim(73-95)) between the 3-peak patterns of the belonging target peak 73 and the belonging candidate peak 95 is calculated as
P_Sim(73-95) = (|p1-p2|+1) × (|r1-(r2+d)|+1)
+ (|dn1-fn1|+1) × (|(cn1-r1)-(en1-r2)|+1)
+ (|dn2-fn2|+1) × (|(cn2-r1)-(en2-r2)|+1).
In the formula, d is a value for correcting the variation in retention time.
Fig. 64 illustrates a method for calculating the degree of matching of peak patterns by comparing peak patterns consisting of 5 peaks. In this case, the peak pattern 125 of the belonging peak 73 and the peak pattern 129 of the belonging candidate peak 95 are taken as examples.
In the peak pattern 125 of the belonging target peak 73, the peak and retention time of the belonging target peak 73 are p1 and r1, and the peaks and retention times of the peak pattern constituting peaks 69, 71, 75, and 77 are dn1 and cn1, dn2 and cn2, dn3 and cn3, and dn4 and cn4, respectively.
In the peak pattern 129 of the belonging candidate peak 95, the peak and retention time of the belonging candidate peak 95 are p2 and r2, and the peaks and retention times of the peak pattern constituting peaks 91, 93, 97, and 99 are fn1 and en1, fn2 and en2, fn3 and en3, and fn4 and en4, respectively.
The degree of coincidence (P_Sim(73-95)) of the 5-peak patterns of the belonging target peak 73 and the belonging candidate peak 95 is calculated as
P_Sim(73-95) = (|p1-p2|+1) × (|r1-(r2+d)|+1)
+ (|dn1-fn1|+1) × (|(cn1-r1)-(en1-r2)|+1)
+ (|dn2-fn2|+1) × (|(cn2-r1)-(en2-r2)|+1)
+ (|dn3-fn3|+1) × (|(cn3-r1)-(en3-r2)|+1)
+ (|dn4-fn4|+1) × (|(cn4-r1)-(en4-r2)|+1).
In the formula, d is a value for correcting the variation in retention time.
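The 3-peak and 5-peak formulas share one structure: a term for the center peaks plus one term per pair of constituting peaks. The sketch below generalizes this to any number of constituting peaks and also takes the minimum over all pattern pairs. It is an illustration only: each peak is represented as a tuple (peak value, retention time), the "peak" value is assumed to be the peak height, and the names `p_sim` and `p_sim_min` are hypothetical.

```python
from itertools import combinations


def p_sim(target, target_neighbors, cand, cand_neighbors, d=0.0):
    """Degree of matching of two peak patterns; a smaller value is a better match.

    target, cand: (peak value, retention time) of the center peaks.
    target_neighbors, cand_neighbors: equally long lists of constituting peaks.
    d: correction for the variation in retention time.
    """
    p1, r1 = target
    p2, r2 = cand
    score = (abs(p1 - p2) + 1) * (abs(r1 - (r2 + d)) + 1)
    for (dn, cn), (fn, en) in zip(target_neighbors, cand_neighbors):
        # Constituting peaks are compared by retention time relative to their center peak.
        score += (abs(dn - fn) + 1) * (abs((cn - r1) - (en - r2)) + 1)
    return score


def p_sim_min(target, target_pool, cand, cand_pool, k, d=0.0):
    """Minimum P_Sim over every pair of k-peak selections from the two candidate pools."""
    return min(
        p_sim(target, t_sel, cand, c_sel, d)
        for t_sel in combinations(target_pool, k)
        for c_sel in combinations(cand_pool, k)
    )
```

Every factor is at least 1, so two identical patterns with k constituting peaks give the lowest possible value, k + 1.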
In the peak assigning unit 37, as shown in fig. 67 and 68, the matching degree of the UV spectrum is calculated between the assignment target peak and the assignment candidate peak.
Fig. 65 shows the UV spectra (135 and 139) of the belonging target peak 73 and the belonging candidate peak 95. As shown in fig. 66, the degree of coincidence (UV_Sim(73-95)) between these two UV spectra is calculated as
UV_Sim(73-95) = RMSD(135 vs 139).
RMSD is the root mean square deviation, defined as the square root of the mean of the squared distances (dis) between corresponding pairs of points. That is,
RMSD = √{(Σ dis²) / n},
where n is the number of dis values.
Here, the waveform of the UV spectrum includes a maximum wavelength and a minimum wavelength, and the degree of coincidence may be calculated by comparing the maximum wavelength and the minimum wavelength or any one of them. However, although the maximum wavelength and the minimum wavelength are the same in a compound having no absorption property or a compound having similar absorption property, the waveforms as a whole may be greatly different, and there is a possibility that the degree of coincidence of the waveforms cannot be calculated in comparison of the maximum wavelength and the minimum wavelength.
In contrast, when RMSD is used with a waveform of a UV spectrum, the waveform of the UV spectrum can be compared with the entire waveform, so that the degree of coincidence can be calculated more accurately, and the waveform can be identified accurately with a compound having no absorption characteristics or a compound having similar absorption characteristics.
The matching degree of the UV spectrum is calculated in the same way for all belonging candidate peaks of the belonging target peak 73.
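The RMSD comparison of two UV spectra can be sketched as follows. The sketch assumes that both spectra are sampled at the same wavelengths, so that "dis" is the difference in absorbance at each wavelength; the function name `rmsd` is a hypothetical label.

```python
import math


def rmsd(spectrum_a, spectrum_b):
    """Root mean square deviation between two equally sampled UV spectra."""
    if len(spectrum_a) != len(spectrum_b) or not spectrum_a:
        raise ValueError("spectra must be non-empty and of equal length")
    squared = [(x - y) ** 2 for x, y in zip(spectrum_a, spectrum_b)]
    return math.sqrt(sum(squared) / len(squared))
```

Identical spectra give 0, and the value grows with the deviation over the whole waveform rather than only at the maximum or minimum wavelength.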
Further, the peak assigning unit 37 calculates the matching degree of the candidate peaks to be assigned by combining the matching degrees of the both as shown in fig. 67.
As shown in fig. 67, the degree of coincidence of the belonging candidate peak (SCORE(73-95)) is calculated by multiplying the degrees of coincidence of the peak patterns and of the UV spectra. The score showing the degree of coincidence of the peak patterns 73, 95 is P_Sim_min(73-95), and the score showing the degree of coincidence of the corresponding UV waveform data 135, 111 is UV_Sim(73-95). The degree of coincidence SCORE(73-95) of the belonging candidate peak is then calculated as
SCORE(73-95) = P_Sim_min(73-95) × UV_Sim(73-95).
The degree of coincidence is calculated in the same manner for all belonging candidate peaks of the belonging target peak 73.
Then, the SCORE is compared between all the candidate peaks to determine the candidate peak to which the SCORE is the smallest as the belonging peak of the belonging target peak 73.
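The final decision step can be sketched as below. The candidate scores in the usage are made-up illustrative numbers, and `assign_peak` is a hypothetical name; the sketch only shows the product and the selection of the smallest SCORE.

```python
def assign_peak(candidates):
    """Pick the belonging candidate peak with the smallest SCORE.

    candidates: dict mapping a candidate peak id to (P_Sim_min, UV_Sim).
    Returns (best candidate id, dict of SCORE values).
    """
    scores = {cid: p_sim_min * uv_sim for cid, (p_sim_min, uv_sim) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

For example, `assign_peak({93: (6.0, 0.5), 95: (3.0, 0.25), 97: (8.0, 0.5)})` selects candidate 95. Note that, because the two degrees are multiplied, a UV_Sim of exactly 0 yields a SCORE of 0 whatever the peak pattern score is.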
The peak assignment unit 37 determines the peak to be assigned of the target peak by combining two viewpoints, and thus can assign the peak accurately.
In addition, in the target FP peak feature value generation unit 7, each peak of the target FP43 is assigned to the reference group FP45 as shown in fig. 68, based on the assignment result of the target FP to the reference FP.
The peaks of the object FP43 are attributed to the reference FP constituting the reference group FP45 by the foregoing attribution processing. According to the attribution result, the peak is finally attributed to the reference group FP 45.
The reference group FP45 is prepared by assigning all of the plurality of reference FPs rated as normal products in the above-described manner, and each peak is represented by the average value (black dot) ± standard deviation (vertical dividing line) of the assigned peaks.
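Summarizing the assigned peaks of the reference FPs as mean ± standard deviation can be sketched as follows. The text does not say whether the sample or the population standard deviation is used; the sample form (`statistics.stdev`) is an assumption here, as are the function name and the data layout.

```python
from statistics import mean, stdev


def reference_group_fp(assigned_heights):
    """Summarize each assigned peak across the reference FPs as (mean, standard deviation).

    assigned_heights: dict mapping a peak id to the heights of that peak in the
    reference FPs to which it was assigned.
    """
    return {
        peak: (mean(heights), stdev(heights) if len(heights) > 1 else 0.0)
        for peak, heights in assigned_heights.items()
    }
```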
Fig. 69 is a result of attributing the subject FP43 to the reference group FP45, which is the subject FP peak characteristic value 47 of the subject FP 43.
Operating principle for creating FP region division characteristic value
Fig. 70 to 86 show the principle of operation of FP region division feature value creation, fig. 70 is an explanatory diagram showing the number quantization by region division, fig. 71 is an explanatory diagram showing the relationship with the variation of retention time and the like, fig. 72 is an explanatory diagram showing the position of a region being changed and the number quantization being added, fig. 73 is a graph showing the data of FP type 2, fig. 74 is an explanatory diagram showing the pattern of FP type 2, fig. 75 is an explanatory diagram showing the feature value of each region formed by region division by vertical and horizontal division lines, fig. 76 is an explanatory diagram showing the setting of vertical division line (1 st bar), fig. 77 is an explanatory diagram showing the setting of horizontal division line (1 st bar), fig. 78 is an explanatory diagram showing the region division by vertical and horizontal division lines, fig. 79 is an explanatory diagram showing the number of region being featured, fig. 80 is a specific explanatory diagram showing region 1, fig. 81 is a graph showing the heights and the total of all peaks, fig. 82 is an explanatory graph showing the total of the peak heights of the region 1, fig. 83 is a graph showing the feature values of the entire region formed by the first 1 pattern, fig. 84 is a graph showing the feature values of the regions formed by sequentially changing the position of the vertical 1-th bar, fig. 85 is a graph showing the feature values of the regions formed by sequentially changing the position of the horizontal 1-th bar, and fig. 86 is a graph showing the feature values of the individual 1 type without changing the positions of the vertical and horizontal dividing lines.
The target FP region division feature value creation unit 11 or the reference FP region division feature value creation unit 23 creates the target FP region division feature value or the reference FP region division feature value from the existence rate of the peak existing in each region into which the target FP type 2 or the reference FP type 2 is divided, as described above.
The division of the region is performed, for example, in the manner shown in fig. 70. In fig. 70, for example, FP55 of drug a is segmented. The signal is divided by a plurality of vertical dividing lines 141 parallel to the signal intensity axis and a plurality of horizontal dividing lines 143 parallel to the time axis to form a plurality of lattices 145 as a plurality of regions.
In the present embodiment, the plurality of horizontal dividing lines 143 are set at intervals that widen at an equal ratio in the direction in which the signal intensity increases. With this setting, the region where the peaks are dense is divided into fine segments, and the existence rate of the peaks can be grasped more accurately. However, the number of horizontal dividing lines 143 may instead be increased and the lines set at uniform intervals.
The peak height in each grid 145 is quantified as a ratio and set as a feature value.
On the other hand, as shown in fig. 71, the holding time or the peak height fluctuates as shown in FP55A, 55B due to slight variation or the like of the analysis conditions. This variation may cause a large variation in the value in each cell 145.
Case of reference FP type 2:
In the case of the reference FP type 2, as shown in fig. 72, the position of each grid 145 is changed (shifted), and quantification is performed before and after each change. By such an operation, the reference FP region division feature value can be accurately created. The position of each of the grids 145 can be changed by moving the vertical and horizontal dividing lines 141, 143 in parallel within a set range. The quantification performed while changing the position of each grid 145 is described further below.
Fig. 73 shows data d202, d207, and d208 of the reference FP type 2 as an example. This data is composed of information of only the Retention Time (RT) and the peak Height (Height). This data is obtained by removing each UV spectrum of all peaks in the reference FP type 2 corresponding to the reference FP type 2, and the reference FP type 2 is composed of the peaks remaining after removing the characteristic peaks from the plurality of reference FPs and the retention time thereof.
The pattern of data d202, d207, d208 of the reference FP type 2 is shown in fig. 74.
These FP patterns are divided into regions by vertical and horizontal dividing lines 141 and 143, and feature value conversion is performed for each region.
Setting of vertical dividing line (1st line):
To set the position of the 1st vertical dividing line, the retention time (RT), amplitude, and pitch of the 1st line are specified as shown in fig. 76.
Based on these three parameters, the position of the 1st vertical dividing line is set at a plurality of places according to the following rule.
1st vertical dividing line = RT - amplitude + (amplitude × 2 / pitch) × i
(i = 0, 1, 2, ..., pitch - 1)
For example, when RT is 1, the amplitude is 1, and the pitch is 10, the 1st vertical dividing line can be set at
0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, and 1.8.
Setting of horizontal dividing line (1st line):
To set the position of the 1st horizontal dividing line, the height, amplitude, and pitch of the 1st line are specified as shown in fig. 77.
Based on these three parameters, the position of the 1st horizontal dividing line is set at a plurality of places according to the following rule.
1st horizontal dividing line = height - amplitude + (amplitude × 2 / pitch) × i
(i = 0, 1, 2, ..., pitch - 1)
For example, when the height is 1, the amplitude is 0.5, and the pitch is 10, the 1st horizontal dividing line can be set at
0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, and 1.4.
Combination of vertical and horizontal dividing lines (1st line):
For every combination of the 1st vertical and 1st horizontal dividing line positions set above, the 2nd and subsequent dividing lines are set and the region is divided.
For the foregoing example, the combinations are as follows.
1st vertical dividing line × 1st horizontal dividing line
= (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8) × (0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4) = 100 kinds
For each of these 100 combinations, the 2nd and subsequent dividing lines are set in sequence and the regions are divided.
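The candidate positions of the 1st dividing lines and their 100 combinations can be reproduced with a short sketch. The helper name `first_line_positions` is hypothetical; `center` stands for RT in the vertical case and for the height in the horizontal case.

```python
from itertools import product


def first_line_positions(center, amplitude, pitch):
    """Positions center - amplitude + (amplitude * 2 / pitch) * i for i = 0 .. pitch - 1."""
    step = amplitude * 2 / pitch
    # Rounding only tidies floating-point noise such as 0.6000000000000001.
    return [round(center - amplitude + step * i, 10) for i in range(pitch)]


vertical_first = first_line_positions(center=1.0, amplitude=1.0, pitch=10)
horizontal_first = first_line_positions(center=1.0, amplitude=0.5, pitch=10)

# Every pairing of a vertical and a horizontal 1st-line position defines one grid.
first_line_combinations = list(product(vertical_first, horizontal_first))
```

With the example parameters, `vertical_first` runs from 0.0 to 1.8 in steps of 0.2, `horizontal_first` from 0.5 to 1.4 in steps of 0.1, and `first_line_combinations` has 100 entries.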
Setting of vertical and horizontal dividing lines (2nd and subsequent lines):
The 2nd and subsequent vertical dividing lines are set, in the specified number, at the specified interval (arithmetic progression).
i-th vertical dividing line = (i-1)-th vertical dividing line + interval
(i = 2, 3, ...)
The 2nd and subsequent horizontal dividing lines are set, in the specified number, at the specified interval (geometric progression).
i-th horizontal dividing line = (i-1)-th horizontal dividing line + interval × 2^(i-2)
(i = 2, 3, ...)
For example, when the 1st vertical dividing line is 0.0, the vertical interval is 10, the number of vertical lines is 7, the 1st horizontal dividing line is 0.5, the horizontal interval is 1, and the number of horizontal lines is 6, the lines are set as
vertical dividing lines: 0, 10, 20, 30, 40, 50, 60
horizontal dividing lines: 0.5, 1.5, 3.5, 7.5, 15.5, 31.5.
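The two progressions can be checked against the worked example with a small sketch. It reads the horizontal rule as a spacing that doubles with each line (interval × 2^(i-2)), which is the reading that reproduces 0.5, 1.5, 3.5, 7.5, 15.5, 31.5; the function names are hypothetical.

```python
def vertical_lines(first, interval, count):
    """Vertical dividing lines at a constant interval (arithmetic progression)."""
    return [first + interval * i for i in range(count)]


def horizontal_lines(first, interval, count):
    """Horizontal dividing lines whose spacing doubles each step (geometric progression)."""
    lines = [first]
    for i in range(2, count + 1):
        lines.append(lines[-1] + interval * 2 ** (i - 2))
    return lines


v_lines = vertical_lines(0.0, 10, 7)
h_lines = horizontal_lines(0.5, 1, 6)

# Adjacent line pairs bound the regions: 6 columns x 5 rows.
region_count = (len(v_lines) - 1) * (len(h_lines) - 1)
```

The resulting `region_count` is 30, which matches the number of regions stated for this example.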
Area division by dividing lines:
In the above example, the set vertical and horizontal dividing lines are drawn on the FP as shown in fig. 78.
The FP is converted to feature values for each region enclosed by the vertical and horizontal lines.
Since there are 30 regions in total, 30 feature values are obtained as shown in fig. 79.
Feature value conversion for each region:
Each region is converted to a feature value according to the following equation.
Feature value = sum of the peak heights in the region / sum of all the peak heights
Feature value calculation method:
Next, the feature value of region 1 of d202 shown in fig. 80 is obtained from the above equation.
First, the total height of all peaks is calculated to be 15.545472 as shown in fig. 81.
Next, the sum of the peak heights in region 1 is calculated as shown in fig. 82.
Thus, the feature value of region 1 is
feature value = 2 / 15.545472 = 0.128655.
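The per-region calculation can be sketched as follows. Two points are assumptions rather than statements from the text: a peak is placed in the region that contains its apex (retention time, height), and a peak lying exactly on a dividing line is counted in the region above or to the right of it. The peak data in the usage note are made up for illustration, and `region_features` is a hypothetical name.

```python
from bisect import bisect_right


def region_features(peaks, v_lines, h_lines):
    """Feature value per region: sum of peak heights in the region / sum of all peak heights.

    peaks: list of (retention time, height).
    v_lines, h_lines: sorted positions of the vertical and horizontal dividing lines.
    Returns a list of rows (one per height band), each a list of column values.
    """
    total = sum(height for _, height in peaks)
    n_cols, n_rows = len(v_lines) - 1, len(h_lines) - 1
    features = [[0.0] * n_cols for _ in range(n_rows)]
    for rt, height in peaks:
        col = bisect_right(v_lines, rt) - 1
        row = bisect_right(h_lines, height) - 1
        # Peaks outside the outermost dividing lines belong to no region.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            features[row][col] += height / total
    return features
```

For instance, `region_features([(5.0, 1.0), (7.0, 1.0), (15.0, 2.0)], [0, 10, 20], [0.5, 1.5, 3.5])` returns `[[0.5, 0.0], [0.0, 0.5]]`. The worked value in the text follows the same ratio: 2 / 15.545472 is approximately 0.128655.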
Feature values of the whole area:
According to the above-described calculation method, the feature values of all the regions formed by the first pattern are calculated. The calculation results are shown in fig. 83.
Feature value conversion while sequentially changing the 1st vertical dividing line:
The feature values of each region formed by sequentially changing the position of the 1st vertical dividing line are calculated in the above manner. The results are shown in fig. 84.
Feature value conversion while sequentially changing the 1st horizontal dividing line:
The position of the 1st horizontal dividing line is then changed in turn, and for each position the 1st vertical dividing line is again changed sequentially. The feature values of each region formed are calculated in the above manner. The results are shown in fig. 85.
By this processing, when the 1st vertical and horizontal dividing lines each have 10 positions, a table of
100 rows (100 kinds) × 31 columns (file name + 30 feature values) is obtained.
Feature value conversion of all reference data (reference type 2 group FP):
The processing up to this point is carried out on all the reference data. For example, when the reference data are the 3 data d202, d207, and d208, a table of
300 rows (100 kinds × 3 data) × 31 columns (file name + 30 feature values) is obtained.
Case of object FP type 2:
In the object FP type 2, there is only
1 combination of the 1st vertical and horizontal dividing lines (1 vertical (RT) position and 1 horizontal (height) position), and therefore only the feature values for that 1 combination are calculated.
Value of MD
As described above, fig. 87 to 91 are diagrams showing various objects FP and their evaluation values (MD values) by the evaluation unit 27, and the evaluation unit 27 can obtain MD values (MD values: 0.26, 2.20, etc.) by performing the attribution processing on the respective objects FP as described above.
Method for evaluating multicomponent drug
Fig. 92 is a process view showing the method of evaluating a multicomponent drug of example 1 of the present invention as a method of evaluating a pattern.
As shown in fig. 92, the evaluation method of the multicomponent drug includes an FP creation process 148, a target FP peak attribution process 149, a target FP peak feature value creation process 151, a target FP type 2 creation process 153, a target FP region division feature value creation process 155, a target FP feature value merging process 157, a reference FP peak attribution process 159, a reference FP attribution result merging process 161, a reference FP peak feature value creation process 163, a reference FP type 2 creation process 165, a reference FP region division feature value creation process 167, a reference FP feature value merging process 169, and an evaluation process 171.
The FP production process 148 includes a target FP production process 173 and a reference FP production process 175.
The target FP peak attribution process 149 has a reference FP selection process 177, a peak pattern making process 179, and a peak attribution process 181.
These FP creation process 148, object FP peak attribution process 149, object FP peak feature value creation process 151, object FP type 2 creation process 153, object FP region division feature value creation process 155, object FP feature value merging process 157, reference FP peak attribution process 159, reference FP attribution result merging process 161, reference FP peak feature value creation process 163, reference FP type 2 creation process 165, reference FP region division feature value creation process 167, reference FP feature value merging process 169, and evaluation process 171 are performed by using the above-described evaluation device 1 for a multicomponent drug in the present embodiment.
The FP creation process 148 is performed by the function of the FP creation unit 3 in fig. 1. Similarly, the target FP peak attribution process 149, the target FP peak feature value creation process 151, the target FP type 2 creation process 153, the target FP region division feature value creation process 155, the target FP feature value merging process 157, the reference FP peak attribution process 159, the reference FP attribution result merging process 161, the reference FP peak feature value creation process 163, the reference FP type 2 creation process 165, the reference FP region division feature value creation process 167, the reference FP feature value merging process 169, and the evaluation process 171 are performed by the functions of the target FP peak attribution unit 5, the target FP peak feature value creation unit 7, the target FP type 2 creation unit 9, the target FP region division feature value creation unit 11, the target FP feature value merging unit 13, the reference FP peak attribution unit 15, the reference FP attribution result merging unit 17, the reference FP peak feature value creating unit 19, the reference FP type 2 creating unit 21, the reference FP region division feature value creating unit 23, the reference FP feature value merging unit 25, and the evaluation unit 27, respectively.
The functions of the respective processes may be performed by separate computers, for example, the target FP creation process 173, the target FP peak assignment process 149, the target FP peak feature value creation process 151, the target FP type 2 creation process 153, the target FP region division feature value creation process 155, the target FP feature value merging process 157, and the evaluation process 171 may be performed by one computer, and the reference FP creation process 175, the reference FP peak assignment process 159, the reference FP assignment result merging process 161, the reference FP peak feature value creation process 163, the reference FP type 2 creation process 165, the reference FP region division feature value creation process 167, and the reference FP feature value merging process 169 may be performed by another computer.
At this time, a reference FP integrated feature value is created by another computer and supplied to the evaluation process 171.
In this way, the object FP type 2 making-out process 153 makes out the object FP type 2 as a pattern in which the peak changes in time series. The target FP-region division feature value creating process 155 forms an FP-region division feature value creating process as a target pattern region division feature value creating process of dividing the target pattern type 2 into a plurality of regions and creating a target pattern region division feature value from the existence rate of peaks existing in each region.
Evaluation procedure for multicomponent drug
Fig. 93 to 108 are flowcharts of the evaluation program for the multicomponent drug, and fig. 109 to 116 are flowcharts of the creation of the reference data. Fig. 117 is a diagram showing a data example of a 3D chromatogram, fig. 118 is a diagram showing a data example of peak information, fig. 119 is a diagram showing a data example of an FP, and fig. 120 is a diagram showing an example of the result of the assignment calculation of the target FP to the reference FP (determination result file). Fig. 121 is a diagram showing an example of the two intermediate files (the assignment candidate peak score table and the assignment candidate peak number table) created in the comparison process that specifies the peaks corresponding between the target FP and the reference FP, and fig. 122 is a diagram showing an example of the comparison result file, which holds the result of specifying those corresponding peaks. Fig. 123 is a diagram showing a data example of the reference group FP, fig. 124 is a diagram showing an example of the peak feature value data file of the target FP assigned to the reference group FP, fig. 125 is a diagram showing a data example of the target and reference FP type 2, fig. 126 is a diagram showing an example of the target FP region division feature value file, fig. 127 is a diagram showing an example of the target FP feature value merged file, fig. 128 is a diagram showing an example of the reference type 2 group FP, and fig. 129 is a diagram showing an example of the reference group merged data.
Fig. 93 and 94 are flowcharts showing the overall procedure for evaluating the evaluation target drug. The procedure begins when the system is started, and the following functions are realized on the computer: the FP creation function of the FP creation unit 3, the target FP peak attribution function of the target FP peak attribution unit 5, the target FP peak feature value creation function of the target FP peak feature value creation unit 7, the target FP type 2 creation function of the target FP type 2 creation unit 9, the target FP region division feature value creation function of the target FP region division feature value creation unit 11, the target FP feature value merging function of the target FP feature value merging unit 13, the reference FP peak attribution function of the reference FP peak attribution unit 15, the reference FP attribution result merging function of the reference FP attribution result merging unit 17, the reference FP peak feature value creation function of the reference FP peak feature value creation unit 19, the reference FP type 2 creation function of the reference FP type 2 creation unit 21, the reference FP region division feature value creation function of the reference FP region division feature value creation unit 23, the reference FP feature value merging function of the reference FP feature value merging unit 25, and the evaluation function of the evaluation unit 27.
The FP creation function is realized by step S1. The object FP peak attribution function is realized by steps S2, S3, S4. The target FP peak feature value creation function is realized in step S5. The object FP type 2 creation function is realized by step S6. The target FP region division feature value creating function is realized in step S7. The object FP feature value merging function is realized by step S8. The evaluation function is realized by steps S9, S10.
In step S1, "FP creation processing" is executed using the 3D chromatogram and the peak information of the specific detection wavelength as input data.
The 3D chromatogram is data obtained by analyzing the evaluation target drug by HPLC, and is data composed of three-dimensional information of retention time, detection wavelength, and peak (signal intensity), as shown in a data example 183 of the 3D chromatogram of fig. 117. The peak information is obtained by processing chromatogram data at a specific wavelength obtained by the HPLC analysis by an HPLC data analysis tool (for example, ChemStation), and is data composed of a maximum value and an area value of all peaks detected as peaks, a retention time at that time point, and the like, as shown in a peak information example 185 of fig. 118.
In step S1, the object creation unit 29 (fig. 1) of the FP creation unit 3 of the computer functions to create the target FP 43 (fig. 2) from the 3D chromatogram and the peak information, and outputs the data as a file. As shown in data example 187 of FP in fig. 119, the target FP 43 is data composed of the retention time, the peak height, and the UV spectrum of each peak.
In step S2, "target FP attribution processing 1" is executed with the target FP and all the reference FPs output in step S1 as inputs.
In step S2, the reference FP selection unit 33 of the computer functions to calculate the degree of matching between the retention time appearance pattern of the target FP43 and all the reference FPs, and to select a reference FP suitable for attribution of the target FP 43.
The reference FP is an FP created from the 3D chromatogram of the drug evaluated as a normal product and the peak information by the same processing as in step S1. In addition, the normal product is defined as a drug with confirmed safety and effectiveness, and belongs to a plurality of drugs with different product batches. The reference FP is also data configured in the same manner as the FP data example 187 in fig. 119.
In step S3, "target FP attribution processing 2" is executed with the target FP 43 and the reference FP selected in step S2 as inputs.
In step S3, the peak pattern creating unit 35 (fig. 1) and the peak assigning unit 37 (fig. 1) of the computer function. With this function, peak patterns are created in a linear manner as shown in fig. 23 to 61 for all the peaks of the target FP 43 and of the reference FP selected in step S2, and the degree of coincidence between these peak patterns is then calculated (P_Sim in fig. 63 or 64). The degree of coincidence of the UV spectra is also calculated between the peaks of the target FP and the reference FP (UV_Sim in fig. 66). The degree of coincidence of each attribution candidate peak is then calculated from these two values (SCORE in fig. 67). The calculation result is output to a determination result file of the kind shown as the determination result file 189 in fig. 120.
Step S4 executes "object FP attribution processing 3" with the determination result archive 189 output at step S3 as input.
In step S4, the peak assignment unit 37 of the computer functions to specify, between the target FP 43 and the reference FP, the peak of the reference FP corresponding to each peak of the target FP based on the degree of coincidence (SCORE) of the attribution candidate peaks. The result is output to a comparison result file of the kind shown as the comparison result file 195 in fig. 122.
Step S5 receives the comparison result file output in step S4 and the reference group FP197 as input, and executes the "target FP assignment process 4".
The reference group FP197 is peak correspondence data between all reference FPs created by the same processing as in steps S2 to S4.
In step S5, the target FP peak feature value creation unit 7 of the computer functions to assign each peak of the target FP 43 to a peak of the reference group FP 197, as shown in fig. 68 and 69, based on the comparison result file of the target FP 43. The result is output to a peak data feature value file of the kind shown as file example 199 in fig. 124.
Step S6 receives the peak data feature value file output in step S5 and the target FP, and executes the process "creation of FP_type 2".
In step S6, the target FP type 2 generation unit 9 of the computer functions to remove the peak 47 specified by the target FP peak feature value generation unit 7 from the original target FP43, and to generate an FP composed of the remaining peaks and the holding time thereof as the target FP type 2 (49). The result is output to the FP type 2 file (see FP type 2 file example 201 in fig. 125).
In step S7, the process "feature value creation for the target FP_type 2 by region division" is executed. In this processing, the target FP region division feature value creation unit 11 of the computer functions to create the target FP region division feature values by the region division shown in fig. 70. The result is output to the target FP region division feature value file (see target FP region division feature value file example 203 in fig. 126).
In step S8, a process of "merging of peak data feature values and region division feature values" is performed.
In this processing, the target FP feature value merging unit 13 of the computer functions to merge the target FP peak feature value 47 created by the target FP peak feature value creation unit 7 with the target FP region divided feature value 51 created by the target FP region divided feature value creation unit 11 to create a target FP merged feature value. The result is output to the target FP eigenvalue merged file (refer to the target FP eigenvalue merged file case 205 of fig. 127).
In step S9, the evaluation unit 27 of the computer functions to evaluate, by the MT method, the equivalence between the target FP integrated feature value output in step S8 and the reference FP integrated feature value, and outputs the evaluation result as an MD value, as shown in fig. 87 to 91.
Step S10 receives the MD value output in step S9, and executes "pass/fail determination".
In step S10, the evaluation unit 27 of the computer functions to compare the MD value output in step S9 with a preset threshold value (upper limit value of MD value) and determine whether or not the MD value is acceptable (evaluation result 53 in fig. 2).
S1: FP creation process (using only a single wavelength)
Fig. 95 is a flowchart showing the details of step S1 "FP creation process" in fig. 93 when the peak information of a single wavelength is used.
FIG. 95 is a detail of a procedure for creating an FP as an evaluation target at a single wavelength, for example, 203 nm. In this processing, from the 3D chromatogram and the peak information at a detection wavelength of 203nm, the FP consisting of the retention time and peak of the peak detected at 203nm and the UV spectrum of these peaks is created.
In step S101, the process "read peak information" is executed. In this processing, the peak information, which is the first of the two data sets needed to create an FP, is read, and the process proceeds to step S102.
In step S102, the process "sequentially acquire the retention time (R1) of a peak and the corresponding peak data (P1)" is executed. In this processing, the retention time (R1) and the peak data (P1) are acquired from the peak information one peak at a time, and the process proceeds to step S103.
In step S103, the process "read 3D chromatogram" is executed. In this processing, the 3D chromatogram, which is the second of the two data sets needed to create an FP, is read, and the process proceeds to step S104.
In step S104, a process of sequentially acquiring the peak holding time (R2) and the corresponding UV spectrum (U1) is performed. In this process, the retention time (R2) and the UV spectrum (U1) are acquired from the 3D chromatogram at each "sampling rate" in the HPLC analysis, and the process proceeds to step S105.
In step S105, the judgment process "|R1 - R2| < threshold?" is executed. In this processing, it is determined whether R1 and R2 read in steps S102 and S104 agree to within the threshold. If they agree (yes), the two retention times are regarded as the same, the UV spectrum of the peak at retention time R1 is determined to be U1, and the process proceeds to step S106. If they do not agree (no), the two retention times differ, the UV spectrum of the peak at retention time R1 is determined not to be U1, and the process returns to step S104 to compare the next data of the 3D chromatogram. The threshold in this judgment is "sampling rate/2" of the 3D chromatogram.
In step S106, a process of "normalizing U1 with a maximum value of 1" is executed. In this process, U1 of the UV spectrum determined to be R1 in S105 is normalized to have a maximum value of 1, and the process proceeds to step S107.
In step S107, a process of "outputting R1 and P1 and normalized U1 (object FP)" is performed. In this processing, R1 and P1 obtained from the peak information and U1 normalized in S106 are output to the subject FP, and the process proceeds to step S108.
In step S108, the judgment process "Is processing of all peaks completed?" is executed. In this processing, it is determined whether all peaks in the peak information have been processed. If any peak remains unprocessed (no), the process returns to step S102 to process it. The processing of steps S102 to S108 is repeated until all peaks have been processed, and once they have (yes), the FP creation process is completed.
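The loop of steps S101 to S108 can be sketched as follows. This is a minimal illustration rather than the program of the embodiment: the names `FPPeak`, `create_fp`, and `sampling_interval`, and the in-memory list layout of the peak information and the 3D chromatogram, are assumptions made for the example. The sketch takes "sampling rate/2" to mean half the spacing between consecutive chromatogram retention times.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FPPeak:
    retention_time: float     # R1 from the peak information
    height: float             # P1 from the peak information
    uv_spectrum: List[float]  # U1, scaled so that its maximum is 1


def create_fp(
    peak_info: List[Tuple[float, float]],
    chromatogram: List[Tuple[float, List[float]]],
    sampling_interval: float,
) -> List[FPPeak]:
    """peak_info holds (R1, P1) pairs; chromatogram holds (R2, U1) rows."""
    threshold = sampling_interval / 2.0            # threshold used in S105
    fp: List[FPPeak] = []
    for r1, p1 in peak_info:                       # S102, repeated via S108
        for r2, u1 in chromatogram:                # S104
            if abs(r1 - r2) < threshold:           # S105
                top = max(u1)
                # S106: normalize to a maximum of 1 (left as-is if all zero)
                scaled = [v / top for v in u1] if top else list(u1)
                fp.append(FPPeak(r1, p1, scaled))  # S107
                break
    return fp
```

With a strict "<" comparison, a peak lying exactly halfway between two sampling points matches neither row and is left out of the FP; whether the flowchart intends "<" or "≤" at that boundary is not stated in the text.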
S1: FP creation process (using a plurality of wavelengths)
Fig. 96 and 97 are flowcharts in the case of replacing the peak information of the single wavelength with peak information of a plurality of wavelengths in the FP creation process of step S1 in fig. 93. For example, the FP is formed by selecting a plurality of (n) wavelengths in the detection wavelength axis direction including 203 nm.
In this FP creation process, when all peaks detected in the 3D chromatogram cannot be covered with a single wavelength as shown in fig. 95, an FP covering all peaks in the 3D chromatogram is created using peak information of a plurality of wavelengths.
In addition, fig. 96 and 97 show details of a procedure of creating n FPs for each wavelength in the FP creation process using only the single wavelength, and then creating FPs formed of a plurality of wavelengths from these FPs.
In step S110, a process of "create FP for each wavelength" is executed. In this process, FP creation processing using only the single wavelength is performed for each wavelength, n FPs are created, and the process proceeds to step S111.
In step S111, a process of "tabulating FPs by the number of peaks (descending order)" is performed. In this processing, n FPs are tabulated in order of the number of peaks, and the process proceeds to step S112.
In step S112, 1 is substituted into n to initialize a counter for sequentially processing n FPs (n ← 1), and the process proceeds to step S113.
In step S113, the process of "reading the nth FP of the list" is performed. In this process, the nth FP of the list is read, and the process moves to step S114.
In step S114, "acquire all holding times (X)" is executed. In this processing, all pieces of holding time information of the FPs read in S113 are acquired, and the process proceeds to step S115.
In step S115, a process of "update of n (n ← n + 1)" is executed. In this processing, since the processing is moved to the next FP, n +1 is substituted into n as an update of n, and the process moves to step S116.
In step S116, the process of "reading the nth FP of the list" is performed. In this processing, the nth FP of the list is read, and the process moves to step S117.
In step S117, a process of "acquiring all holding times (Y)" is executed. In this processing, all the holding time information of the FP read in step S116 is acquired, and the process proceeds to step S118.
In step S118, a process of "merging (Z) X and Y without repetition" is performed. In this process, the retention time information X obtained in step S114 and the retention time information Y obtained in step S117 are combined without repetition, stored in Z, and the process proceeds to step S119.
In step S119, the process of "update of X (X ← Z)" is executed. In this process, Z stored in S118 is substituted into X, and the process proceeds to step S120 as an update of X.
In step S120, the judgment process "Is processing of all FPs completed?" is executed. In this processing, it is determined whether all the n FPs created in step S110 have been processed, and if so (yes), the process proceeds to step S121. If an unprocessed FP remains (no), the process returns to step S115 to execute the processing of steps S115 to S120 on it. The processing of steps S115 to S120 is repeated until all FPs have been processed.
In step S121, 1 is substituted into n to initialize a counter for sequentially processing n FPs again (n ← 1), and the process proceeds to step S122.
In step S122, the process of "reading the nth FP of the list" is performed. In this processing, the nth FP of the list is read, and then, it moves to step S123.
In step S123, the process "sequentially acquire the retention time (R1), the peak data (P1), and the UV spectrum (U1) of each peak" is executed. In this processing, the retention time (R1), the peak data (P1), and the UV spectrum (U1) are acquired one peak at a time from the FP read in step S122, and the process proceeds to step S124.
In step S124, the process "acquire the retention times (R2) of X in order" is executed. In this processing, one retention time (R2) at a time is acquired from X, which stores the retention times of all FPs without duplicates, and the process proceeds to step S125.
In step S125, the judgment process "R1 = R2?" is executed. In this processing, it is determined whether R1 acquired in step S123 is equal to R2 acquired in step S124. If they are equal (yes), the process proceeds to step S127. If they are not equal (no), the process proceeds to step S126.
In step S126, the judgment process "Is comparison with all retention times of X completed?" is executed. In this processing, it is determined whether R1 acquired in step S123 has been compared with every retention time in X. If the comparison is complete (yes), the peak at retention time R1 is judged to have been processed already, and the process returns to step S123 to move on to the next peak. If it is not complete (no), the process returns to step S124 to move on to the next retention time in X.
In step S127, a process of "add (n-1) × analysis time (T) (R1 ← R1+ (n-1) × T) on R1" is performed. In this process, the holding time of the peak of the 1 st FP present in the list with the largest number of peaks is maintained as it is, the holding time of the peak not present in the 1 st FP but present in the 2 nd FP in the list is obtained by adding the analysis time (T) to R1, and the holding time of the peak not present in the 1 st to n-1 st FPs but present in the n-th FP in the list is obtained by adding (n-1) × T to R1, and the process proceeds to step S128.
In step S128, a process of "output R1, P1, and U1 (object FP)" is performed. In this processing, R1 processed in step S127 and P1 and U1 acquired in step S123 are output to the subject FP, and the process proceeds to step S129.
In step S129, the process "delete R2 from X" is executed. In this processing, since the handling of retention time R1 (= R2) was completed in steps S127 and S128, the processed retention time (R2) is deleted from X, and the process proceeds to step S130.
In step S130, the judgment process "Is processing of all peaks completed?" is executed. In this processing, it is determined whether all peaks of the n-th FP in the list have been processed. If they have (yes), the FP creation processing for the n-th FP in the list is completed, and the process proceeds to step S131. If an unprocessed peak remains (no), the process returns to step S123 to process it. The processing of steps S123 to S130 is repeated until all peaks have been processed.
In step S131, a process of "n update (n ← n + 1)" is executed. In this processing, the processing moves to the next FP, substitutes n +1 for n as an update of n, and moves to step S132.
In step S132, the judgment process "Is processing of all FPs completed?" is executed. In this processing, it is determined whether all the n FPs created in step S110 have been processed, and if so (yes), the FP creation process is completed. If an unprocessed FP remains (no), the process returns to step S122 to execute the processing of steps S122 to S132 on it. The processing of steps S122 to S132 is repeated until all FPs have been processed.
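The merging of per-wavelength FPs in steps S110 to S132 can be sketched as follows. This is an illustrative outline under stated assumptions, not the program itself: the function name `merge_fps`, the tuple layout `(retention_time, height, uv_spectrum)`, and the use of a Python set for X are choices made for the example. Retention times are compared for exact equality, as in the "R1 = R2?" judgment of step S125.

```python
from typing import List, Tuple

Peak = Tuple[float, float, List[float]]  # (retention time, peak data, UV spectrum)


def merge_fps(fps: List[List[Peak]], analysis_time: float) -> List[Peak]:
    """Combine FPs created at n wavelengths into one FP."""
    # S111: list the FPs in descending order of peak count
    ordered = sorted(fps, key=len, reverse=True)

    # S112 to S120: X is the union of all retention times, without duplicates
    remaining = set()
    for fp in ordered:
        remaining.update(rt for rt, _, _ in fp)

    merged: List[Peak] = []
    for n, fp in enumerate(ordered, start=1):      # S121 to S132
        for rt, height, uv in fp:                  # S123
            if rt in remaining:                    # S124 to S126
                # S127: shift by (n - 1) * T so that peaks taken from the
                # n-th FP are placed after those of the earlier FPs
                merged.append((rt + (n - 1) * analysis_time, height, uv))  # S128
                remaining.discard(rt)              # S129
    return merged
```

A peak whose retention time already appeared in an earlier FP of the list is skipped, because that retention time has been removed from X by the time the later FP is read.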
S2: object FP attribution processing 1
Fig. 98 is a flowchart showing the detailed configuration of the "target FP attribution process 1" of step S2 of fig. 93. This processing is a preprocessing for attribution, and a reference FP suitable for attribution of the target FP43 is selected from among a plurality of reference FPs serving as normal products.
In step S201, the process "read target FP" is executed. In this processing, the FP that is the target of attribution is read, and the process proceeds to step S202.
In step S202, a process of "acquiring all holding times (R1)" is executed. In this processing, all the holding time information of the object FP read in S201 is acquired, and the process proceeds to step S203.
In step S203, a process of "listing the file names of all the reference FPs" is executed. In this processing, the file names of all the reference FPs are listed in advance so that all the reference FPs can be processed in sequence thereafter, and the process proceeds to step S204.
In step S204, 1 is substituted into n as an initial value of a counter for sequentially processing all the references FP (n ← 1), and the process proceeds to step S205.
In step S205, the process "read the nth reference FP in the list (reference FP_n)" is executed. In this processing, the nth FP in the file name list of all the reference FPs created in step S203 is read, and the process proceeds to step S206.
In step S206, a process of "acquiring all holding times (R2)" is executed. In this processing, all pieces of holding time information of the reference FP read in step S205 are acquired, and the process proceeds to step S207.
In step S207, the process "calculate the degree of coincidence of the retention time appearance patterns of R1 and R2 (RP_n_min)" is executed. In this processing, RP_n_min is calculated from the retention times of the target FP acquired in step S202 and the retention times of the reference FP acquired in step S206, and the process proceeds to step S208. The detailed calculation flow of RP_n_min is described separately in subroutine 1 of fig. 85.
In step S208, the process "save RP_n_min (RP_all_min)" is executed. In this processing, the RP_n_min calculated in step S207 is stored in RP_all_min, and the process proceeds to step S209.
In step S209, the process of "update of n (n ← n + 1)" is executed. In this processing, in order to move the processing to the next FP, n +1 is substituted into n as an update of n, and the flow moves to step S210.
In step S210, the judgment process "Is processing of all reference FPs completed?" is executed. In this processing, it is determined whether all the reference FPs have been processed, and if so (yes), the process proceeds to step S211. If an unprocessed reference FP remains (no), the process returns to step S205 to execute the processing of steps S205 to S210 on it. The processing of steps S205 to S210 is repeated until all the reference FPs have been processed.
In step S211, the process "select the reference FP with the minimum degree of coincidence from RP_all_min" is executed. In this processing, the RP_n_min values calculated for all the reference FPs and stored in RP_all_min are compared, the reference FP whose retention time appearance pattern gives the smallest degree-of-coincidence value against the target FP is selected, and the target FP attribution process 1 is completed.
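The selection loop of steps S201 to S211 can be sketched as below. The calculation of RP_n_min itself belongs to subroutine 1 and is not reproduced here, so the sketch accepts it as a caller-supplied function. The names `select_reference_fp` and `toy_distance`, and the dictionary keyed by file name, are hypothetical. `toy_distance` is a placeholder used only to make the example runnable and is not the retention time appearance pattern calculation of the embodiment.

```python
from typing import Callable, Dict, List, Tuple


def select_reference_fp(
    target_rts: List[float],
    reference_fps: Dict[str, List[float]],
    rp_min: Callable[[List[float], List[float]], float],
) -> Tuple[str, Dict[str, float]]:
    """Return the name of the reference FP with the smallest RP_n_min."""
    rp_all_min: Dict[str, float] = {}
    for name, ref_rts in reference_fps.items():        # S204 to S210
        rp_all_min[name] = rp_min(target_rts, ref_rts)  # S207, S208
    best = min(rp_all_min, key=rp_all_min.get)           # S211
    return best, rp_all_min


def toy_distance(a: List[float], b: List[float]) -> float:
    # Placeholder standing in for subroutine 1: mean gap between each
    # target retention time and its nearest reference retention time.
    return sum(min(abs(x - y) for y in b) for x in a) / len(a)


best, table = select_reference_fp(
    [1.0, 2.0, 3.0],
    {"lot_A": [1.1, 2.1, 3.1], "lot_B": [1.5, 2.5, 3.6]},
    toy_distance,
)
```

Keeping every RP_n_min in `rp_all_min` mirrors the save step S208 and leaves the full comparison available for inspection after the minimum has been chosen.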
S3: object FP attribution processing 2
Fig. 99 is a flowchart of the detailed case of the "subject FP attribution processing 2" of step S3 of fig. 93. This processing is a main attribution processing, and the degree of agreement (SCORE) between the target FP43 and the reference FP selected in step S2 is calculated from the degree of agreement between the peak pattern and the UV spectrum.
In step S301, the process "read target FP" is executed. In this processing, the FP that is the target of attribution is read, and the process proceeds to step S302.
In step S302, the process "sequentially acquire the retention time (R1), the peak data (P1), and the UV spectrum (U1) of the attribution target peak" is executed. In this processing, each peak of the target FP read in step S301 is taken in turn as the attribution target peak, its R1, P1, and U1 are acquired, and the process proceeds to step S303.
In step S303, the process of "reading reference FP" is executed. In this processing, the reference FP selected in the "target FP belonging process 1" in fig. 98 is read, and the process proceeds to step S304.
In step S304, a process of sequentially acquiring the retention time (R2) of the peak of the reference FP, the peak data (P2), and the UV spectrum (U2) is performed. In this processing, R2, P2, and U2 are obtained in order of 1 peak to 1 peak from the reference FP read in step S303, and the process proceeds to step S305.
In step S305, the judgment process "|R1 - (R2 + d)| < threshold?" is executed. In this processing, it is determined whether R1 and R2 read in steps S302 and S304 agree to within the threshold. If they agree (yes), the peak at retention time R2 is determined to be an attribution candidate peak for the peak at retention time R1, and the process proceeds to step S306 to calculate the degree of coincidence (SCORE) of the attribution candidate peak. If they do not agree (no), the difference between retention time R2 and retention time R1 is too large for the peak to be an attribution candidate peak, and the process proceeds to step S309. In this judgment, d is a value for correcting the retention time between the peaks of the target FP and the reference FP, and its initial value is 0. The threshold is the allowable retention time width for deciding whether a peak should be treated as an attribution candidate peak.
In step S306, the process "calculate the degree of coincidence of the UV spectra (UV_Sim)" is executed. In this processing, UV_Sim is calculated from U1 of the attribution target peak acquired in step S302 and U2 of the attribution candidate peak acquired in step S304, and the process proceeds to step S307. The detailed calculation flow of UV_Sim is described in subroutine 2 of fig. 86.
In step S307, the process "calculate the degree of coincidence of the peak patterns (P_Sim_min)" is executed. In this processing, peak patterns are created for R1 and P1 of the attribution target peak acquired in step S302 and for R2 and P2 of the attribution candidate peak acquired in step S304. P_Sim_min of these peak patterns is calculated, and the process proceeds to step S308. The detailed calculation flow of P_Sim_min is described in subroutine 3 of fig. 87.
In step S308, the process "calculate the degree of coincidence of the attribution candidate peak (SCORE)" is executed. In this processing, the SCORE of the attribution target peak and the attribution candidate peak is calculated from the UV_Sim calculated in step S306 and the P_Sim_min calculated in step S307 as
SCORE = UV_Sim × P_Sim_min
and the process proceeds to step S310.
In step S309, the process "substitute 888888 into SCORE (SCORE ← 888888)" is executed. In this processing, the SCORE of a peak that does not qualify as an attribution candidate peak for the attribution target peak is set to 888888, and the process proceeds to step S310.
In step S310, the process "save SCORE (SCORE_all)" is executed. In this processing, the SCORE obtained in step S308 or S309 is saved in SCORE_all, and the process proceeds to step S311.
In step S311, the judgment process "Is processing of all peaks of the reference FP completed?" is executed. In this processing, it is determined whether all peaks of the reference FP have been processed, and if so (yes), the process proceeds to step S312. If an unprocessed peak remains (no), the process returns to step S304 to execute the processing of steps S304 to S311 on it. The processing of steps S304 to S311 is repeated until all peaks have been processed.
In step S312, the process "output SCORE_all to the determination result file and initialize SCORE_all (set to empty)" is executed. In this processing, after SCORE_all is output to the determination result file, SCORE_all is initialized (set to empty), and the process proceeds to step S313.
In step S313, the judgment process "Is processing of all peaks of the target FP completed?" is executed. In this processing, it is determined whether all peaks of the target FP have been processed, and if so (yes), the target FP attribution process 2 is completed. If an unprocessed peak remains (no), the process returns to step S302 to execute the processing of steps S302 to S313 on it. The processing of steps S302 to S313 is repeated until all peaks have been processed.
The output determination result file 189 is displayed in fig. 120.
S4: object FP attribution processing 3
Fig. 100 is a flowchart showing the details of the "subject FP belonging process 3" of step S4 of fig. 93. This process is an attribution post-process, and the peak of the reference FP corresponding to each peak of the target FP is specified from the agreement degree (SCORE) of the attribution candidate peaks calculated as described above.
In step S401, the process "read determination result file" is executed. In this processing, the determination result file created in "target FP attribution processing 2" of fig. 99 is read, and the process proceeds to step S402.
In step S402, the process "create an attribution candidate peak score table from data satisfying the condition SCORE < threshold" is executed. In this processing, the attribution candidate peak score table (see the attribution candidate peak score table 191 in the upper part of fig. 121) is created from the SCOREs of the determination result file, and the process proceeds to step S403. The attribution candidate peak score table lists, for each peak of the reference FP, only those SCOREs calculated against all peaks of the target FP that are smaller than the threshold, arranged in ascending order. The smaller the SCORE value, the more likely it is that the peak should be assigned. The threshold is the upper limit of SCORE for deciding whether a peak should be treated as an attribution candidate.
In step S403, the process "create attribution candidate peak number table" is executed. In this processing, the attribution candidate peak number table (see the attribution candidate peak number table 193 in the lower part of fig. 121) is created from the attribution candidate peak score table, and the process proceeds to step S404. The attribution candidate peak number table is obtained by replacing each score in the attribution candidate peak score table with the peak number of the target FP corresponding to that score. It therefore lists, for each peak of the reference FP, the peak numbers of the target FP in order of how well they should be assigned.
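The construction of the two intermediate tables in steps S402 and S403 can be sketched as follows. This is an illustration under assumptions: the function name `candidate_tables`, the nested-list layout in which `scores[i][j]` is the SCORE of target peak i against reference peak j, and the 1-based peak numbering are choices made for the example and are not taken from the file formats of fig. 120 and 121.

```python
from typing import List, Tuple


def candidate_tables(
    scores: List[List[float]], threshold: float
) -> Tuple[List[List[float]], List[List[int]]]:
    """Build the attribution candidate peak score and peak number tables."""
    n_ref = len(scores[0]) if scores else 0
    score_table: List[List[float]] = []
    number_table: List[List[int]] = []
    for j in range(n_ref):                       # one row per reference peak
        # S402: keep only SCORE < threshold, in ascending order of SCORE
        candidates = sorted(
            (row[j], i + 1) for i, row in enumerate(scores) if row[j] < threshold
        )
        score_table.append([s for s, _ in candidates])
        # S403: same positions, with each score replaced by the target peak number
        number_table.append([num for _, num in candidates])
    return score_table, number_table


scores = [
    [0.2, 888888],   # target peak 1
    [0.1, 0.4],      # target peak 2
    [888888, 0.3],   # target peak 3
]
score_table, number_table = candidate_tables(scores, threshold=1.0)
```

The first entry of each row of `number_table` is the highest-ranked target peak for that reference peak, which is the value step S404 reads. Rows may be empty when no target peak falls below the threshold.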
In step S404, a process of "obtaining the peak number of the target FP to be attributed" is executed. In this process, the top-ranked peak number of the target FP is obtained for each peak of the reference FP from the attribution candidate peak number table created in step S403, and the process proceeds to step S405.
In step S405, a judgment process of "are the acquired peak numbers arranged in descending order (without repetition)?" is executed. In this process, it is determined whether the peak numbers of the target FP acquired in step S404 are arranged in descending order without repetition. If they are (yes), it is determined that the peak of the target FP corresponding to each peak of the reference FP has been identified, and the process proceeds to step S408. If they are not (no), the process proceeds to step S406 to re-evaluate the target FP peaks assigned to the problematic reference FP peaks.
In step S406, a process of "comparing SCORE between the problematic peaks and updating the attribution candidate peak number table" is executed. In this process, the SCOREs corresponding to the problematic target FP peak numbers are compared using the attribution candidate peak score table, the attribution candidate peak number table is updated by replacing the peak number with the larger SCORE by the second-ranked peak number, and the process proceeds to step S407.
In step S407, a process of "updating the attribution candidate peak score table" is executed. In this process, the attribution candidate peak score table is updated to reflect the changes made to the attribution candidate peak number table in S406, and the process returns to step S404. The processing of steps S404 to S407 is repeated until no problem (a repetition, or a departure from descending order) remains in the peak numbers of the target FP.
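The loop of steps S402 to S407 can be sketched as follows. This is only an illustrative reading of the text, not the patented implementation: the function names, the 0-based peak numbers, the rule of checking adjacent reference peaks, and the `descending` flag (which follows the wording of step S405 and can be flipped) are all assumptions.

```python
def build_candidate_tables(scores, threshold):
    """S402-S403 (sketch): scores[i][j] is the SCORE of target peak j for
    reference peak i. Keep only SCORE < threshold, sorted ascending, and
    pair every score with its target peak number."""
    return [sorted((s, j) for j, s in enumerate(row) if s < threshold)
            for row in scores]


def resolve_assignments(table, descending=True):
    """S404-S407 (sketch): take the top candidate of each reference peak,
    check ordering and repetition, and demote the candidate with the
    larger SCORE until no problem remains."""
    table = [list(cands) for cands in table]  # work on a copy
    while True:
        # S404: top-ranked candidate of every reference peak that has one
        picked = [(i, cands[0]) for i, cands in enumerate(table) if cands]
        conflict = None
        # S405: adjacent picks must be strictly ordered (hence not repeated)
        for (i1, (s1, t1)), (i2, (s2, t2)) in zip(picked, picked[1:]):
            ordered = t1 > t2 if descending else t1 < t2
            if not ordered:
                conflict = (i1, s1, i2, s2)
                break
        if conflict is None:
            return [cands[0][1] if cands else None for cands in table]
        # S406-S407: the peak with the larger SCORE falls back to its
        # next-ranked candidate (both tables are updated together here)
        i1, s1, i2, s2 = conflict
        loser = i1 if s1 > s2 else i2
        table[loser].pop(0)


scores = [[0.1, 0.9, 0.9],
          [0.2, 0.3, 0.9],
          [0.9, 0.9, 0.1]]
result = resolve_assignments(build_candidate_tables(scores, 0.5),
                             descending=False)
```

In this example reference peaks 0 and 1 both prefer target peak 0; reference peak 1 has the larger SCORE and therefore falls back to its second candidate.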
In step S408, a process of "saving the attribution result (TEMP)" is executed. In this process, the peak numbers, holding times, and peak data of all peaks of the reference FP, together with the peak data of the target FP peaks identified as corresponding to them, are stored in TEMP, and the process proceeds to step S409.
In step S409, a judgment process of "are all peaks of the target FP included in TEMP?" is executed. In this process, it is determined whether the peak data of all peaks of the target FP is included in the TEMP stored in step S408. If all peaks are included (yes), it is determined that processing has been completed for all peaks of the target FP, and the process proceeds to step S412. If a peak has not yet been included (no), the process proceeds to step S410 to add the peak data of that peak to TEMP.
In step S410, a process of "correcting the holding time of a peak of the target FP not included in TEMP" is executed. In this process, the holding time of a peak of the target FP not included in TEMP (the target FP peak to be corrected) is corrected to the holding time of the reference FP by:
correction value = k1 + (k2 - k1) × (t0 - t1) / (t2 - t1)
k1: of the two assigned peaks adjacent to the target FP peak to be corrected, the holding time on the reference FP side of the peak with the shorter holding time,
k2: of the two assigned peaks adjacent to the target FP peak to be corrected, the holding time on the reference FP side of the peak with the longer holding time,
t0: the holding time of the target FP peak to be corrected,
t1: of the two assigned peaks adjacent to the target FP peak to be corrected, the holding time on the target FP side of the peak with the shorter holding time,
t2: of the two assigned peaks adjacent to the target FP peak to be corrected, the holding time on the target FP side of the peak with the longer holding time.
Then, the process proceeds to step S411.
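The correction in step S410 is a linear interpolation between the two neighbouring assigned peaks. A minimal sketch, using the symbols k1, k2, t0, t1, t2 from the text (the function name and the guard against t1 = t2 are assumptions added for illustration):

```python
def correct_holding_time(t0, t1, t2, k1, k2):
    """Map the holding time t0 of an unassigned target FP peak onto the
    reference FP time axis, using the neighbouring assigned peak pairs
    (t1 -> k1) and (t2 -> k2)."""
    if t2 == t1:
        raise ValueError("t1 and t2 must differ")
    return k1 + (k2 - k1) * (t0 - t1) / (t2 - t1)


# A target peak halfway between its neighbours lands halfway between
# the corresponding reference holding times.
corrected = correct_holding_time(t0=12.0, t1=10.0, t2=14.0, k1=10.5, k2=14.9)
```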
In step S411, a process of "adding the corrected holding time and the peak data of the peak to TEMP, and updating TEMP" is executed. In this process, the holding time corrected in S410 for the target FP peak not included in TEMP is compared with the holding times of the reference FP in TEMP, the corrected holding time and the peak data of that peak are added at the appropriate position in TEMP to update TEMP, and the process returns to step S409. The processing of steps S409 to S411 is repeated until all peaks of the target FP have been added.
In step S412, a process of "outputting TEMP to the comparison result file" is executed. In this process, TEMP, in which the correspondence between all peaks of the reference FP and all peaks of the target FP has been specified, is output as the comparison result file, and the target FP attribution processing 3 is completed.
S5: object FP attribution processing 4
Figs. 101 and 102 are flowcharts showing the details of the "target FP attribution processing 4" in step S5 of fig. 93. This processing is the final attribution processing: each peak of the target FP is attributed to a peak of the reference group FP (see data example 197 of the reference group FP in fig. 123) based on the comparison result file (see comparison result file example 195 in fig. 122) created in step S4 of fig. 93.
As described above, the reference group FP197 is an FP specifying a correspondence relationship of peaks among all reference FPs, and as shown in the reference group FP data example 197 in fig. 123, is data composed of a reference group FP peak number, a reference group holding time, and a peak height. As shown in the reference group FP45 of fig. 2, each peak can be displayed as a mean value (black dot) ± standard deviation (vertical line).
In step S501, a process of "reading the comparison result file" is performed. In this process, the comparison result file output in step S412 of fig. 100 is read, and the process proceeds to step S502.
In step S502, a process of "reading the reference group FP" is executed. In this process, the reference group FP 197, to which each peak of the target FP is finally attributed, is read, and the process proceeds to step S503.
In step S503, a process of "merging the target FP and the reference group FP and saving the result (TEMP)" is executed. In this process, the two files are merged on the basis of the peak data of the reference FP, which is present in both the comparison result file and the reference group FP 197; the result is saved as TEMP, and the process proceeds to step S504.
In step S504, a process of "correcting the holding time of the peak of the target FP having no peak corresponding to the reference FP" is executed. In this processing, the holding times of all peaks of the target FP having no peak corresponding to the reference FP in the comparison result file are corrected to the holding time of TEMP stored in S503, and the process proceeds to step S505. The correction of the holding time is performed in the same manner as in step S410 of the "target FP assignment process 3" in step S4.
In step S505, a process of sequentially acquiring the corrected retention times (R1, R3) and the corresponding peak data (P1) is performed. In this processing, the holding times R1 and R3 corrected in step S504 and the peak data of the corresponding peak P1 are sequentially acquired, and the process proceeds to step S506.
In step S506, a process of "sequentially acquiring the holding time (R2) of an attribution candidate peak and the corresponding peak data (P2) from TEMP" is executed. In this process, from the TEMP stored in S503, the holding times to which no peak of the target FP has been assigned are acquired in order as R2, with the corresponding peak data as P2, and the process proceeds to step S507.
In step S507, a judgment process of "|R1 - R2| < threshold 1?" is executed. In this process, it is determined whether the difference between R1 and R2 acquired in S505 and S506 is smaller than threshold 1. If it is smaller (yes), it is determined that the peak of the target FP with holding time R1 may correspond to the peak of the reference FP with holding time R2, and the process proceeds to step S508. If the difference between R1 and R2 is equal to or greater than threshold 1 (no), it is determined that no correspondence is possible, and the process proceeds to step S512.
In step S508, a process of "acquiring the UV spectra (U1, U2) corresponding to R1 and R2" is executed. In this process, the UV spectra of the peaks at R1 and R2, the holding times judged in step S507 to possibly correspond, are acquired from the respective FPs, and the process proceeds to step S509.
In step S509, a process of "calculating the degree of coincidence of the UV spectra (UV_Sim)" is executed. In this process, UV_Sim is calculated from the UV spectra U1 and U2 acquired in step S508 in the same manner as in step S306 of the "target FP attribution processing 2" in step S3, and the process proceeds to step S510. The detailed calculation flow of UV_Sim is described in subroutine 2 of fig. 104.
In step S510, a judgment process of "UV_Sim < threshold 2?" is executed. In this process, it is determined whether the UV_Sim calculated in S509 is smaller than threshold 2. If it is smaller (yes), it is determined that the peak with UV spectrum U1 corresponds to the peak with UV spectrum U2, and the process proceeds to step S511. If UV_Sim is equal to or greater than threshold 2 (no), it is determined that the peaks do not correspond, and the process proceeds to step S507.
In step S511, a process of "R3 ← R2, threshold 2 ← UV_Sim" is executed. In this process, R3 (initially R1), the holding time of the peak judged at S510 to correspond, is updated to R2, the holding time of its counterpart peak; threshold 2 is then updated to the value of UV_Sim, and the process proceeds to step S507.
In step S512, a judgment process of "has comparison with the holding times of all attribution candidate peaks been completed?" is executed. In this process, it is determined whether comparison of R1 with the holding times of all attribution candidate peaks has been completed. If it has (yes), the process proceeds to step S513. If it has not (no), the process proceeds to step S507.
In step S513, a process of "saving R1, R3, P1, and threshold 2 (TEMP2)" is executed. In this process, R1, R3 (which has been updated to the holding time R2 of the counterpart peak if a correspondence was found at step S510), the corresponding peak data P1, and the current threshold 2 are saved in TEMP2, and the process proceeds to step S507.
In step S514, a judgment process of "has comparison of the holding times of all non-corresponding peaks been completed?" is executed. In this process, it is determined whether the comparison between the holding times of all non-corresponding peaks and the holding times of the attribution candidate peaks has been completed. If it has (yes), it is determined that all non-corresponding peaks have been processed, and the process proceeds to step S516. If it has not (no), it is determined that an unprocessed non-corresponding peak remains, and the process proceeds to step S515.
In step S515, a process of "threshold 2 ← initial value" is executed. In this process, threshold 2, which was updated to UV_Sim in S511, is restored to its initial value, and the process proceeds to step S505.
In step S516, a judgment process of "are there peaks with the same value of R3 in TEMP2?" is executed. In this process, it is determined whether a plurality of non-corresponding peaks have been attributed to the same peak in TEMP. If such non-corresponding peaks exist (yes), the process proceeds to step S517. If not (no), the process proceeds to step S518.
In step S517, a process of "comparing threshold 2 among peaks with the same R3 and restoring R3 of the peak with the larger value to its original value (R1)" is executed. In this process, the threshold 2 values of the peaks in TEMP2 that share the same R3 are compared, R3 of the peak with the larger value is restored to its original value (that is, R1), and the process proceeds to step S518.
In step S518, a process of "adding the peaks of TEMP2 to TEMP (only peaks whose R3 matches a holding time in TEMP)" is executed. In this process, only those peaks in TEMP2 whose R3 coincides with a holding time in TEMP are added to TEMP at the peak corresponding to R3, and the process proceeds to step S519. A peak whose R3 does not match any holding time in TEMP is not added, because the peak to which it would be attributed does not exist in the reference group FP.
In step S519, a process of "outputting the peaks of the target FP in TEMP (peak data feature value file)" is executed. In this process, the peak data of the target FP attributed to the reference group FP 197 is output as the peak data feature value file, and the target FP attribution processing 4 is completed.
FIG. 124 shows an example of a profile 199 of the peak data characteristic values outputted in the above-mentioned manner.
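The matching loop of steps S505 to S517 can be read roughly as in the sketch below. It is a simplified interpretation under stated assumptions: peaks are given as (holding time, UV spectrum) pairs, `None` stands for "R3 restored to R1, no counterpart", and all names other than R1, R2, R3, threshold 1, threshold 2 and UV_Sim are hypothetical.

```python
import math


def uv_sim(u1, u2):
    """Root mean square distance of two equally long UV spectra."""
    z = sum((b - c) ** 2 for b, c in zip(u1, u2))
    return math.sqrt(z / len(u1))


def assign_unmatched(unmatched, candidates, threshold1, threshold2):
    """unmatched: (R1, U1) of target FP peaks without a counterpart.
    candidates: (R2, U2) of attribution candidate peaks.
    Returns (R1, R3) pairs, with R3 = None when nothing was assigned."""
    temp2 = []
    for r1, u1 in unmatched:
        r3, best = None, threshold2          # S515: threshold 2 starts fresh
        for r2, u2 in candidates:
            if abs(r1 - r2) < threshold1:    # S507
                sim = uv_sim(u1, u2)         # S508-S509
                if sim < best:               # S510
                    r3, best = r2, sim       # S511
        temp2.append((r1, r3, best))         # S513
    # S516-S517: if several peaks share one R3, keep the smallest value
    winner = {}
    for idx, (_, r3, best) in enumerate(temp2):
        if r3 is not None and (r3 not in winner
                               or best < temp2[winner[r3]][2]):
            winner[r3] = idx
    return [(r1, r3 if r3 is not None and winner[r3] == idx else None)
            for idx, (r1, r3, _) in enumerate(temp2)]


pairs = assign_unmatched(
    unmatched=[(10.0, [1, 2, 3]), (10.2, [1, 2, 4])],
    candidates=[(10.1, [1, 2, 3])],
    threshold1=0.5, threshold2=1.0)
```

Both target peaks fall within threshold 1 of the candidate at 10.1, but only the one with the smaller UV_Sim keeps the assignment.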
Subroutine 1
Fig. 103 is a flowchart showing details of "subroutine 1" of "reference FP selection processing" of fig. 98. This processing calculates the degree of coincidence of the retention time appearance patterns between FPs (e.g., the object FP and the reference FP).
In step S1001, the process of "x ← R1, y ← R2" is executed. In this processing, R1 and R2 acquired in S202 and S206 of fig. 98 are substituted for x and y, respectively, and the process proceeds to step S1002.
In step S1002, a process of "acquiring the numbers of data (a, b) in x and y" is executed. In this process, the numbers of data in x and y are acquired as a and b, respectively, and the process proceeds to step S1003.
In step S1003, 1 is substituted into i as an initial value of a counter for sequentially calling the retention time of x (i ← 1), and the process proceeds to step S1004.
In step S1004, a process of "acquiring all distances (f) from the i-th holding time of x" is executed. In this process, the distances between the i-th holding time of x and all subsequent holding times are acquired as f, and the process proceeds to step S1005.
In step S1005, 1 is substituted into j as an initial value of a counter for sequentially calling the retention time of y (j ← 1), and the process proceeds to step S1006.
In step S1006, a process of "acquiring all distances (g) from the j-th holding time of y" is executed. In this process, the distances between the j-th holding time of y and all subsequent holding times are acquired as g, and the process proceeds to step S1007.
In step S1007, a process of "acquiring the number of data (m) satisfying the condition |each distance in f - each distance in g| < threshold" is executed. In this process, the distances f and g obtained in steps S1004 and S1006 are compared in round-robin fashion, the number of data satisfying the condition "|each distance in f - each distance in g| < threshold" is acquired as m, and the process proceeds to step S1008.
In step S1008, a process of "calculating the degree of coincidence of the holding time appearance patterns of f and g (RP_fg)" is executed. In this process, RP_fg is calculated from a and b acquired in step S1002 and m acquired in step S1007 as
RP_fg = (1 - (m/(a + b - m))) × (a - m + 1)
and the process proceeds to step S1009.
In step S1009, a process of "saving RP_fg (RP_all)" is executed. In this process, the degree of coincidence calculated in step S1008 is stored in RP_all, and the process proceeds to step S1010.
In step S1010, the process of "update of j (j ← j + 1)" is executed. In this process, in order to shift the process of y to the next holding time, j +1 is substituted into j as an update of j, and the process shifts to step S1011.
In step S1011, a judgment process of "has processing been completed for all holding times of y?" is executed. In this process, it is determined whether processing of all holding times of y has been completed. If it has (yes), the process proceeds to step S1012. If it has not (no), it is determined that an unprocessed holding time remains in y, and the process returns to step S1006. That is, the processing of S1006 to S1011 is repeated until all holding times of y have been processed.
In step S1012, a process of "update of i (i ← i + 1)" is executed. In this process, in order to shift the process of x to the next holding time, i +1 is substituted into i as an update of i, and the process shifts to step S1013.
In step S1013, a judgment process of "has processing been completed for all holding times of x?" is executed. In this process, it is determined whether processing of all holding times of x has been completed. If it has (yes), the process proceeds to step S1014. If it has not (no), it is determined that an unprocessed holding time remains in x, and the process returns to step S1004. That is, the processing of steps S1004 to S1013 is repeated until all holding times of x have been processed.
In step S1014, a process of "acquiring the minimum value (RP_min) from RP_all" is executed. In this process, the minimum value is acquired as RP_min from RP_all, which stores the RP values for all combinations of holding time appearance patterns of the target FP and the reference FP; RP_min is passed to step S207 in fig. 98, and the calculation of the degree of coincidence of the holding time appearance patterns is completed.
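Subroutine 1 can be sketched as below. The text does not fully specify how m is counted in the round-robin comparison; this sketch assumes that m is the number of distances in f that have at least one distance in g within the threshold. The function name is hypothetical and 0-based indices replace the counters i and j.

```python
def rp_min(x, y, threshold):
    """Smallest RP_fg over all pairs of starting holding times.
    x, y: ascending holding times of the two FPs being compared."""
    if not x or not y:
        raise ValueError("both FPs need at least one holding time")
    a, b = len(x), len(y)                            # S1002
    rp_all = []
    for i in range(a):                               # S1003, S1012-S1013
        f = [t - x[i] for t in x[i + 1:]]            # S1004
        for j in range(b):                           # S1005, S1010-S1011
            g = [t - y[j] for t in y[j + 1:]]        # S1006
            # S1007 (assumed counting rule, see lead-in)
            m = sum(1 for fd in f
                    if any(abs(fd - gd) < threshold for gd in g))
            # S1008: RP_fg = (1 - m/(a + b - m)) x (a - m + 1)
            rp_all.append((1 - m / (a + b - m)) * (a - m + 1))
    return min(rp_all)                               # S1014


same_pattern = rp_min([1.0, 2.0, 3.0], [11.0, 12.0, 13.0], 0.1)
other_pattern = rp_min([1.0, 2.0, 3.0], [1.0, 5.0, 9.0], 0.1)
```

Because only distances between holding times are compared, a pattern that is merely shifted in time scores as well as an identical one, and a smaller value means a better match.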
Subroutine 2
Fig. 104 is a flowchart showing the details of "subroutine 2" of "subject FP belonging process 2" of fig. 99. This process calculates the degree of coincidence of the UV spectrum.
In step S2001, a process of "x ← U1, y ← U2, z ← 0" is executed. In this process, the UV spectra U1 and U2 acquired in S302 and S304 of fig. 99 are substituted into x and y, respectively, 0 is substituted into z as the initial value of the sum of squares of the distances between the UV spectra, and the process proceeds to step S2002.
In step S2002, a process of "acquiring the number of data in x (a)" is executed. In this process, the number of data in x is acquired as a, and the process proceeds to step S2003.
In step S2003, 1 is substituted into i as an initial value for sequentially calling the absorbance of each detection wavelength constituting the UV spectrum U1 from x, and the process proceeds to step S2004.
In step S2004, a process of "acquiring the i-th data of x (b)" is executed. In this process, the i-th absorbance data of x, into which the UV spectrum U1 was substituted, is acquired as b, and the process proceeds to step S2005.
In step S2005, a process of "acquiring the i-th data of y (c)" is executed. In this process, the i-th absorbance data of y, into which the UV spectrum U2 was substituted, is acquired as c, and the process proceeds to step S2006.
In step S2006, a process of "calculating the UV spectral distance (d) and the sum of squares of the UV spectral distances (z)" is executed. In this process, the UV spectral distance d and the sum of squares z of the UV spectral distances are calculated as
d = b - c
z = z + d²
and the process proceeds to step S2007.
In step S2007, a process of "update of i (i ← i + 1)" is executed. In this process, i +1 is substituted into i as an update of i, and the process proceeds to step S2008.
In step S2008, a judgment process of "has processing of all data in x been completed?" is executed. In this process, it is determined whether processing of all data in x and y has been completed. If it has (yes), the process proceeds to step S2009. If it has not (no), it is determined that unprocessed data remains in x and y, and the process returns to step S2004. That is, the processing of S2004 to S2008 is repeated until all absorbance data in x and y have been processed.
In step S2009, a process of "calculating the degree of coincidence of the UV spectra of x and y (UV_Sim)" is executed. In this process, UV_Sim is calculated from the sum of squares z of the UV spectral distances and the number of data a in x as
UV_Sim = √(z/a)
and this UV_Sim is passed to step S306 in fig. 99, completing the calculation of the degree of coincidence of the UV spectra.
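Subroutine 2 amounts to a root mean square distance between two absorbance vectors. A minimal sketch following steps S2001 to S2009 (the function name and the length check are assumptions; the spectra are taken to be sampled at the same detection wavelengths):

```python
import math


def uv_sim(u1, u2):
    """UV_Sim = sqrt(z / a), where z is the sum of squared absorbance
    differences and a is the number of data points. Smaller is better."""
    if not u1 or len(u1) != len(u2):
        raise ValueError("spectra must be non-empty and of equal length")
    z = 0.0                      # S2001
    a = len(u1)                  # S2002
    for b, c in zip(u1, u2):     # S2003-S2008
        d = b - c                # S2006: d = b - c
        z += d * d               # S2006: z = z + d^2
    return math.sqrt(z / a)      # S2009


identical = uv_sim([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
different = uv_sim([0.0, 0.0], [3.0, 4.0])
```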
Subroutine 3
Fig. 105 is a flowchart showing the details of "subroutine 3" of "subject FP belonging process 2" of fig. 99. This process calculates the degree of conformity of the peak pattern.
In step S3001, a process of "setting the number of peak pattern configuration candidates (m) and the number of peak pattern configuration peaks (n)" is performed. In this process, the number of peak pattern formation candidates (m) and the number of peak pattern formation peaks (n) are set as settings for comprehensively forming the peak patterns, and the process proceeds to step S3002.
In step S3002, a process of "x ← target FP name, r1 ← R1, p1 ← P1, y ← reference FP name, r2 ← R2, p2 ← P2" is executed. In this processing, the file names of the target FP and the reference FP necessary for the processing, and the holding times and peak data acquired in S302 and S304 of fig. 99, are substituted into x, r1, p1, y, r2, and p2, respectively, and the process proceeds to step S3003.
In step S3003, a process of "acquiring all holding times (a) of x" is executed. In this process, the file (object FP) with the name x substituted in S3002 is read, and the entire retention time of the file is acquired as a, and the process proceeds to step S3004.
In step S3004, a process of "acquiring all the holding times (b) of y" is executed. In this process, the file (reference FP) with the name y substituted in S3002 is read, and the total retention time of the file is acquired as b, and the process proceeds to step S3005.
In step S3005, a process of "acquiring the holding times (cm) and peak data (dm) of the m peak pattern configuration candidate peaks of r1" is executed. In this processing, for r1, the holding time of the peak to be assigned, the holding times of the m peak pattern configuration candidate peaks are acquired from a as cm and their peak data as dm, and the process proceeds to step S3006. The m peak pattern configuration candidate peaks are the m peaks whose holding times are closest to r1.
In step S3006, a process of "acquiring the holding times (em) and peak data (fm) of the m peak pattern configuration candidate peaks of r2" is executed. In this processing, for r2, the holding time of the attribution candidate peak, the holding times of the m peak pattern configuration candidate peaks are acquired from b as em and their peak data as fm, and the process proceeds to step S3007. The m peak pattern configuration candidate peaks are the m peaks whose holding times are closest to r2.
In step S3007, a process of "arranging cm and dm in order of holding time (ascending order)" is executed. In this processing, cm and dm acquired in S3005 are rearranged in order of increasing holding time, and the process proceeds to step S3008.
In step S3008, a process of "arranging em and fm in order of holding time (ascending order)" is executed. In this processing, em and fm acquired in step S3006 are rearranged in order of increasing holding time, and the process proceeds to step S3009.
In step S3009, a process of "sequentially acquiring the holding times (cn) and peak data (dn) of n peak pattern constituent peaks from cm and dm" is executed. In this processing, from cm and dm of the m peak pattern configuration candidate peaks, the holding times of n peak pattern constituent peaks are sequentially acquired as cn and their peak data as dn, and the process proceeds to step S3010.
In step S3010, a process of "sequentially acquiring the holding times (en) and peak data (fn) of n peak pattern constituent peaks from em and fm" is executed. In this processing, from em and fm of the m peak pattern configuration candidate peaks, the holding times of n peak pattern constituent peaks are sequentially acquired as en and their peak data as fn, and the process proceeds to step S3011.
In step S3011, a process of "calculating the degree of coincidence of the peak patterns (P_Sim)" is executed. In this processing, P_Sim is calculated from r1 and p1 of the peak to be assigned together with cn and dn of the n peaks constituting its peak pattern, and from r2 and p2 of the attribution candidate peak together with en and fn of the n peaks constituting its peak pattern, all of which have been acquired so far. For example, when n = 4 as shown in fig. 66, P_Sim is calculated as
P_Sim = (|p1 - p2| + 1) × (|r1 - (r2 + d)| + 1)
+ (|dn1 - fn1| + 1) × (|(cn1 - r1) - (en1 - r2)| + 1)
+ (|dn2 - fn2| + 1) × (|(cn2 - r1) - (en2 - r2)| + 1)
+ (|dn3 - fn3| + 1) × (|(cn3 - r1) - (en3 - r2)| + 1)
+ (|dn4 - fn4| + 1) × (|(cn4 - r1) - (en4 - r2)| + 1)
and the process proceeds to step S3012.
In step S3012, a process of "saving P_Sim (P_Sim_all)" is executed. In this processing, the P_Sim calculated in step S3011 is sequentially stored in P_Sim_all, and the process proceeds to step S3013.
In step S3013, a judgment process of "have all combinations of n taken from the m in em been completed?" is executed. In this processing, it is determined whether processing has been completed for all combinations of n peak pattern constituent peaks taken from the m peak pattern configuration candidate peaks of the attribution candidate peak. If it has (yes), it is determined that the exhaustive creation of peak patterns and the calculation of their degrees of coincidence have been completed for the attribution candidate peak, and the process proceeds to step S3014. If it has not (no), it is determined that combinations of n taken from m remain unprocessed, and the process returns to step S3010. That is, the processing of S3010 to S3013 is repeated until all combinations of n taken from m have been processed.
In step S3014, a judgment process of "have all combinations of n taken from the m in cm been completed?" is executed. In this processing, it is determined whether processing has been completed for all combinations of n peak pattern constituent peaks taken from the m peak pattern configuration candidate peaks of the peak to be assigned. If it has (yes), it is determined that the exhaustive creation of peak patterns and the calculation of their degrees of coincidence have been completed for the peak to be assigned, and the process proceeds to step S3015. If it has not (no), it is determined that combinations of n taken from m remain unprocessed, and the process returns to step S3009. That is, the processing of S3009 to S3014 is repeated until all combinations of n taken from m have been processed.
In step S3015, a process of "acquiring the minimum value (P_Sim_min) from P_Sim_all" is executed. In this processing, the minimum value of the P_Sim_all stored in S3012 is acquired as P_Sim_min, this P_Sim_min is passed to step S307 in fig. 99, and the calculation of the degree of coincidence of the peak patterns is completed.
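The exhaustive search of subroutine 3 can be sketched with `itertools.combinations`. The sketch assumes that the m candidate peaks on each side have already been selected (steps S3005 and S3006), that n is at least 1, and that the term d in the formula, which is not defined in this passage, is passed in as an offset defaulting to 0. Function names are hypothetical.

```python
from itertools import combinations


def p_sim(r1, p1, cn, dn, r2, p2, en, fn, d=0.0):
    """P_Sim for one pair of peak patterns (S3011). Smaller is better."""
    total = (abs(p1 - p2) + 1) * (abs(r1 - (r2 + d)) + 1)
    for c, dk, e, fk in zip(cn, dn, en, fn):
        total += (abs(dk - fk) + 1) * (abs((c - r1) - (e - r2)) + 1)
    return total


def p_sim_min(r1, p1, cm, dm, r2, p2, em, fm, n, d=0.0):
    """Minimum P_Sim over all ways of taking n of the m candidate peaks
    on each side (S3007-S3015). Returns None if n exceeds m."""
    target = sorted(zip(cm, dm))        # S3007: ascending holding time
    candidate = sorted(zip(em, fm))     # S3008
    best = None
    for t_sel in combinations(target, n):           # S3009 / S3014
        cn, dn = zip(*t_sel)
        for c_sel in combinations(candidate, n):    # S3010 / S3013
            en, fn = zip(*c_sel)
            score = p_sim(r1, p1, cn, dn, r2, p2, en, fn, d)
            if best is None or score < best:
                best = score
    return best                                     # S3015


times = [8, 9, 11, 12, 13]
heights = [5, 6, 7, 8, 9]
best_score = p_sim_min(10, 100, times, heights, 10, 100, times, heights, n=4)
```

Every term of P_Sim is at least 1, so two identical patterns with n = 4 reach the lower bound of 5. The number of evaluations grows as the square of the number of n-element combinations, which is why m and n are set in step S3001.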
S6: creation processing of object FP type 2
Fig. 106 is a flowchart showing the details of "FP_type 2 creation" in step S6 of fig. 93.
In step S601, the process of "reading the object FP" is executed. In this processing, the file of the object FP43 (see the FP data instance 187 in fig. 119) is read, and the process proceeds to step S602.
In step S602, a process of "reading the peak data feature value file" is executed. In this processing, the peak data feature value file for the target FP 43 (see file example 199 of the peak data feature values in fig. 124) is read, and the process proceeds to step S603. The peak data feature value file contains, for example, the peak information of the target FP 43 that has been attributed to the peaks of the reference group FP 45 by the target FP peak feature value creation unit 7.
In step S603, a process of "comparing the target FP with the peak data feature value file" is executed. In this process, the file of the target FP 43 is compared with the peak data feature value file. By this comparison, the remaining peaks of the target FP 43 that were not attributed to peaks of the reference group FP 45 are identified, and the process proceeds to step S604.
In step S604, a process of "outputting the holding time and the peak data of the peaks existing only in the target FP" is executed. In this process, the holding times and peak data of the remaining peaks of the target FP 43 are output to the data file of the target FP type 2 (see data example 201 of the target FP type 2 in fig. 125).
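Steps S601 to S604 reduce to keeping the target FP peaks that do not appear in the peak data feature value file. A minimal sketch, assuming that peaks can be identified by a peak number (the data layout and names are hypothetical, not taken from the file examples in the figures):

```python
def remaining_peaks(target_fp, attributed_numbers):
    """target_fp: {peak number: (holding time, peak height)}.
    attributed_numbers: peak numbers already attributed to the reference
    group FP. Returns the peaks that exist only in the target FP."""
    attributed = set(attributed_numbers)
    return {no: data for no, data in target_fp.items()
            if no not in attributed}


target_fp = {1: (5.2, 120.0), 2: (7.9, 35.0), 3: (12.4, 80.0)}
fp_type2 = remaining_peaks(target_fp, attributed_numbers=[1, 3])
```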
S7: feature value processing of object FP_type 2 by region segmentation
Fig. 107 is a flowchart showing the details of the "feature value processing of the object FP_type 2 by region segmentation" in step S7 of fig. 94.
In step S701, a process of "setting the region division conditions of the FP space" is executed. In this processing, in order to divide the target FP type 2 into regions, one position is set for the first line of each of the vertical and horizontal dividing lines. With this setting, as shown in figs. 76 and 77, the first vertical and horizontal dividing lines are set in the FP space. In the case of the target FP type 2, however, the positions of the regions are not varied, so no variation width is involved. When the first vertical and horizontal dividing lines have been set in step S701, the process proceeds to step S702.
In step S702, a process of "creating a region division pattern of the FP space" is executed. In this process, the positions of the second and subsequent dividing lines are set for the combination of the first vertical and horizontal dividing lines, and one division pattern is created. By this processing, as shown in fig. 78, the FP space is divided into regions by the vertical and horizontal dividing lines. Once the region division has been performed, the process proceeds to step S703.
In step S703, a process of "reading the file of the target FP_type 2" is executed. Through this processing, the file of the target FP type 2 is read, and the process proceeds to step S704.
In step S704, a process of "calculating total peak data of the entire FP space" is executed. In this processing, for example, as shown in fig. 79, the total height of all peaks existing in each divided lattice 145 is calculated (fig. 81), and the process proceeds to step S705.
In step S705, a process of "dividing the FP space with a division pattern" is performed. In this processing, the target FP type 2 read in step S703 is subjected to region division as shown in fig. 79 by the region division pattern set in step S702, and the process proceeds to step S706.
In step S706, a process of "calculating the existence ratio of peak data in the divided regions" is executed. In this processing, the peak existence ratio in each cell 145 is calculated as feature value = total peak height in the region / total of all peak heights. The calculation results are shown in fig. 86. When the calculation is completed, the process proceeds to step S707.
In step S707, a process of "outputting the existence ratio of each region as a feature value" is executed. In this processing, one target FP region division feature value file is output (see the target FP region division feature value file example 203 shown in fig. 126).
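The region division of steps S704 to S707 can be illustrated as follows. The axes of the FP space are not restated in this passage, so the sketch assumes that vertical dividing lines split the holding time axis and horizontal dividing lines split the peak height axis; the function name and the row-major output order are likewise assumptions.

```python
from bisect import bisect_right


def region_existence_ratios(peaks, time_lines, height_lines):
    """peaks: (holding time, peak height) pairs of the target FP type 2.
    time_lines / height_lines: ascending dividing line positions,
    including the outer borders. Returns one ratio per cell."""
    n_cols = len(time_lines) - 1
    n_rows = len(height_lines) - 1
    cell_totals = [0.0] * (n_cols * n_rows)
    grand_total = sum(h for _, h in peaks)                 # S704
    for t, h in peaks:                                     # S705
        col = bisect_right(time_lines, t) - 1
        row = bisect_right(height_lines, h) - 1
        if 0 <= col < n_cols and 0 <= row < n_rows:
            cell_totals[col * n_rows + row] += h
    if grand_total == 0:
        return [0.0] * len(cell_totals)
    # S706: total peak height in the cell / total of all peak heights
    return [total / grand_total for total in cell_totals]


ratios = region_existence_ratios(
    peaks=[(1.0, 10.0), (2.0, 30.0), (6.0, 60.0)],
    time_lines=[0.0, 5.0, 10.0],
    height_lines=[0.0, 50.0, 100.0])
```

With these inputs the two early, low peaks share one cell (ratio 0.4) and the late, high peak occupies another (ratio 0.6); the ratios sum to 1 as long as every peak lies inside the outer borders.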
S8: merging of peak data eigenvalues and region segmentation eigenvalues
Fig. 108 is a flowchart showing the detailed case of "merging of peak data feature values and region segmentation feature values" in step S8 of fig. 94.
In step S801, a process of "reading the peak data feature value file" is executed. Through this processing, a file like the file example 199 of the peak data feature values shown in fig. 124 is read, and the process proceeds to step S802.
In step S802, a process of "reading the region division feature value file" is executed. Through this processing, the target FP region division feature value file 203 shown in fig. 126 is read, and the process proceeds to step S803.
In step S803, a process of "merging the two sets of feature value data into one horizontal row of data" is executed. Through this processing, the peak data feature value file (see file example 199 of the peak data feature values shown in fig. 124) and the target FP region division feature value file (see file example 203 of the target FP region division feature values shown in fig. 126) are merged into a single row as a merged file of the target FP feature values (see file example 205 of the target FP feature values shown in fig. 127), and the process proceeds to step S804.
In step S804, a process of "outputting the merged data" is executed. In this process, the target FP feature value merged file 205 of fig. 127 is output.
Creation of reference FP attribution result feature value merged file
The reference FP feature value merged file, with which the target FP feature value merged data is compared, is created as shown in figs. 109 to 116.
FIGS. 109 and 110 are flow charts for creating a reference FP feature merged file, which are implemented on a computer to perform the following functions: the FP creation function of the reference FP creation unit 31, the reference FP peak assignment function of the reference FP peak assignment unit 15, the reference FP attribution result combination function of the reference FP attribution result combination unit 17, the reference FP peak feature value creation function of the reference FP peak feature value creation unit 19, the reference FP type 2 creation function of the reference FP type 2 creation unit 21, the reference FP region division feature value creation function of the reference FP region division feature value creation unit 23, and the reference FP feature value combination function of the reference FP feature value combination unit 25.
The reference FP creation function is realized by step S10001. The reference FP peak attribution function is realized by steps S10002, S10003, and S10004. The reference FP attribution result merging function is realized by step S10005. The reference FP peak feature value creation function is realized by step S10006. The reference FP type 2 creation function is realized by step S10007. The reference FP region division feature value creation function is realized by step S10008. The reference FP feature value merging function is realized by step S10009.
Steps S10001 to S10004 correspond to steps S1 to S4 in creating the target FP feature value merged file in figs. 93 and 94, and steps S10007 to S10009 correspond to steps S6 to S8.
Step S10001 executes "FP creation processing" using the 3D chromatogram and the peak information of the specific detection wavelength as input data.
Each of the plurality of evaluation standard drugs (standard chinese medicines) serving as an evaluation standard has a 3D chromatogram and peak data.
In step S10001, the reference FP production unit 31 (fig. 1) of the FP production unit 3 of the computer functions to produce a reference FP from the 3D chromatogram and the peak information in the same manner as the target FP 43 (fig. 2), and outputs the data of the reference FP as a file.
Step S10002 receives all the reference FPs output in step S10001, and executes "reference FP assigning process 1".
In step S10002, the reference FP peak assigning unit 15 of the computer functions to select combinations from all the reference FPs so that the assignment score can be calculated for every reference FP, one selected combination at a time, and the process proceeds to step S10003.
Step S10003 executes "reference FP assigning process 2" with the combination of the selected reference FPs as input.
In step S10003, peak patterns are created comprehensively as shown in figs. 23 to 61 for all peaks of the combination of reference FPs selected in step S10002, and the degree of coincidence between these peak patterns is then calculated (P_Sim in fig. 63 or 64). Further, the degree of coincidence of the UV spectra (UV_Sim in fig. 66) is calculated between the peaks of the selected combination of reference FPs. Then, the degree of coincidence of the attribution candidate peaks (SCORE in fig. 67) is calculated from these two degrees of coincidence. The calculation result is output as a determination result file (see determination result file example 189 of fig. 120).
Step S10004 receives the determination result file output in step S10003, and executes "reference FP assignment process 3".
In step S10004, the corresponding peaks between the selected combination of reference FPs are identified according to the degree of coincidence (SCORE) of the attribution candidate peaks. The result is output as reference FP attribution data for each reference FP.
Step S10005 receives all the reference FP attribute data outputted in step S10004, and executes the "reference FP attribute result merge process".
In step S10005, the reference FP attribution result merging unit 17 of the computer functions to merge all the reference FP attribution data and create a reference FP correspondence table by referring to the peak correspondence relationship of each reference FP specified by the reference FP peak attribution unit 15, and the process proceeds to step S10006.
In step S10006, the reference FP peak feature value creation unit 19 of the computer functions to create peak feature values (reference group FP) from all the reference FPs based on the reference FP correspondence table created by the reference FP attribution result merging unit 17. The reference FP peak feature value creation unit 19 calculates statistics (maximum value, minimum value, median value, average value, and the like) for each peak (column) of the reference FP correspondence table, and selects peaks (columns) based on this information. The selected peaks (columns) are output as a reference group FP (see reference group FP example 197 in fig. 123).
Step S10007 receives the reference group FP and all reference FPs output in step S10006, and executes a process of "creating FP _ type 2".
In step S10007, the reference FP type 2 generation unit 21 of the computer functions in the same manner as the target FP type 2 generation unit 9. As in step S6 of fig. 93, the peaks that have been converted to feature values as described above are removed from each of the plurality of reference FPs, and an FP composed of the remaining peaks and their retention times is generated as the reference FP type 2 (see FP type 2 file example 201 of fig. 125).
In step S10008, "feature value processing of reference FP type 2" is executed. In this processing, the reference FP region division feature value creation unit 23 of the computer functions to create a reference FP region division feature value by region division in fig. 73 to 85. The result is output as the reference type 2 group FP (see reference type 2 group FP example 207 of fig. 128).
In step S10009, a process of "reference data creation process" is executed. In this processing, the reference FP feature value merging unit 25 of the computer functions to merge the reference group FP created by the reference FP peak feature value creation unit 19 with the reference type 2 group FP created by the reference FP region division feature value creation unit 23, and create feature value data of all the reference FPs. The result is output as reference group merged data (see reference group merged data example 209 in fig. 129).
S10005: Creation of reference FP correspondence table
Fig. 111 and 112 are flowcharts showing details of "reference FP attribution result merging processing (creation of reference FP correspondence table)" in step S10005 in fig. 110.
In step S10101, a process of "reading the attribution data that is 1st in attribution order as merged data" is performed. In this processing, the reference FP attribution data that was processed first in the attribution processing of step S10004, and whose peak correspondence has been specified, is read as merged data, and the process proceeds to step S10102.
In step S10102, a process of "sequentially reading the 2nd and subsequent attribution data" is performed. In this processing, the reference FP attribution data that was processed second in the attribution processing of step S10004, and whose peak correspondence has been specified, is first read as attribution data, and the process proceeds to step S10103.
In step S10103, a process of "merging the merged data and the belonging data with the common peak data" is performed. In this processing, the two files are merged based on the peak data of the reference FP that is present in common in the merged data and the belonging data, the merged data is updated with the result thereof, and the process proceeds to step S10104.
In step S10104, the judgment process "have all peaks in the attribution data been added to the merged data?" is performed. In this processing, it is determined whether or not all peaks of the attribution data have been added to the merged data. If the addition is completed (yes), the process proceeds to step S10105. If there is a peak that has not been added (missing peak) (no), the process proceeds to step S10107 to add the missing peak to the merged data. The processing for adding missing peaks to the merged data (steps S10107 to S10120) is performed in the same manner as steps S504 to S517 of S5 (target FP assignment processing 4).
In step S10121, a process of "adding data of TEMP2 to the merged data (all holding times and peaks)" is executed. In this process, the entire holding time (R3) and the peak (P1) of the TEMP2 are added to the position of the merged data, and the process proceeds to step S10122.
In step S10122, a process of "threshold 2 ← initial value, and deletion of all data in TEMP2" is executed. In this process, threshold 2, which had been updated to UV_Sim, is restored to its initial value, all data such as the retention times and peaks of the missing peaks is deleted from TEMP2, and the process returns to step S10104.
In step S10105, reached from step S10104, the judgment process "is the processing of all attribution data completed?" is performed. In this processing, it is determined whether or not the processing of all the attribution data is completed. When the processing is completed (yes), the process proceeds to step S10106 to output the reference FP correspondence table as the result of merging all the attribution data. If the processing is not completed (no), the process returns to step S10102, and the remaining attribution data is processed sequentially.
In step S10106, a process of "outputting the merged data (reference FP correspondence table)" is performed. In this processing, the result of merging all the attribution data is output as the reference FP correspondence table, and the reference FP correspondence table creation processing is completed.
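The merge loop of steps S10101 to S10106 can be pictured with the simplified sketch below. It is not the procedure of figs. 111 and 112: the placement of missing peaks by retention time and UV_Sim thresholds (steps S10107 to S10120) is replaced here by simply appending the missing peak and re-sorting by mean retention time. The table layout (one dictionary per aligned peak, keyed by a reference FP name) and all identifiers are assumptions made for illustration.

```python
def merge_attribution(table, common_fp, new_fp, pairs):
    """Merge one set of attribution data into the merged data.

    table     : list of dicts {fp_name: (retention_time, height)}, one dict per aligned peak
    common_fp : name of the reference FP present in both the merged data and the attribution data
    new_fp    : name of the reference FP being added
    pairs     : list of (retention_time_in_common_fp or None, (retention_time, height))
                None marks a peak of new_fp with no counterpart (a missing peak)
    """
    by_rt = {row[common_fp][0]: row for row in table if common_fp in row}
    for common_rt, new_peak in pairs:
        row = by_rt.get(common_rt)
        if row is None:
            # Missing peak: simplified handling, appended as a new aligned peak.
            table.append({new_fp: new_peak})
        else:
            row[new_fp] = new_peak
    # Keep aligned peaks ordered by their mean retention time (assumption).
    table.sort(key=lambda row: sum(p[0] for p in row.values()) / len(row))
    return table
```

Calling this function once per set of attribution data, in attribution order, yields a table in which each aligned peak lists its height in every reference FP where it was found.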
S10006: peak feature value processing
Fig. 113 is a flowchart showing details of the "peak feature value processing (creation of reference group FP)" in step S10006 of fig. 109.
In step S10201, a process of "reading the reference FP correspondence table" is performed. In this processing, the reference FP correspondence table created in step S10005 is read, and the process proceeds to step S10202.
In step S10202, a process of "calculating statistics for each peak (column)" is performed. In this process, statistics (maximum value, minimum value, median, average value, variance, standard deviation, number of occurrences, and presence rate) are calculated for each peak (column) of the reference FP correspondence table, and the process proceeds to step S10203.
In step S10203, a process of "selecting peaks (columns) with reference to the calculated statistics" is performed. In this process, peaks are selected with reference to the statistics calculated in step S10202, and the process proceeds to step S10204.
In step S10204, a process of "outputting the selected peak (column) (reference group FP)" is performed. In this processing, the result of selecting the peak (column) is output as the reference group FP based on the statistic, and the reference group FP is created.
Fig. 123 shows a reference group FP example 197 output in the above manner.
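The per-column statistics and the peak selection of steps S10202 and S10203 might look like the sketch below. The text does not state the selection criterion, so the presence-rate threshold used here is purely an assumption, as are the function names, the use of population variance, and the convention that an absent peak is stored as None.

```python
import statistics


def column_statistics(values):
    """Statistics for one peak (column); None marks a reference FP lacking the peak."""
    present = [v for v in values if v is not None]
    stats = {
        "count": len(present),
        "presence_rate": len(present) / len(values) if values else 0.0,
    }
    if present:
        stats.update(
            max=max(present),
            min=min(present),
            median=statistics.median(present),
            mean=statistics.mean(present),
            variance=statistics.pvariance(present),  # population variance (assumption)
            std=statistics.pstdev(present),
        )
    return stats


def select_peaks(table, min_presence=0.5):
    """Return the column names whose presence rate reaches the (assumed) threshold."""
    return [name for name, column in table.items()
            if column_statistics(column)["presence_rate"] >= min_presence]
```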
S10007: reference FP type 2 creation process
Fig. 114 is a flowchart showing details of "reference FP compilation processing (creation of reference FP _ type 2)" in step S10007 of fig. 110.
In step S10301, a process of "sequentially reading the reference FP" is performed. In this process, the files of the plurality of reference FPs (see the FP data instance 187 in fig. 119) are read, and the process proceeds to step S10302.
In step S10302, a process of "reading reference group FP" is performed. In this process, the data file of the reference group FP is read (see the data example 197 of the reference group FP in fig. 123), and the process proceeds to step S10303.
In step S10303, a process of "extracting the peak data feature values of the reference FP from the reference group FP" is performed. In this process, the peak data feature values of the reference FP subjected to the attribution processing are extracted from the file of the reference group FP, and the process proceeds to step S10304.
In step S10304, a process of "comparing the reference FP with the extracted peak data feature value file" is performed. The reference FP is compared with the peak data feature value file, and the process proceeds to step S10305.
In step S10305, a process of "outputting the retention times and peak data of the peaks existing only in the reference FP" is performed. The peaks contained in the peak data feature value file are removed from the reference FP, and the process proceeds to step S10306.
In step S10306, the judgment process "is the processing completed for all reference FPs?" is performed. In this processing, when the processing is completed for all the reference FPs (yes), step S10007 is completed; if the processing is not completed for all the reference FPs (no), steps S10301 to S10305 are repeated. In this way, the plurality of reference FPs are processed sequentially, and the peaks of the peak data feature value file are removed from each reference FP to create a file of the reference FP type 2 (see data example 201 of the target and reference FP type 2 shown in fig. 125).
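The core of steps S10303 to S10305 is a set difference: peaks already represented by peak data feature values are dropped, and the remainder forms the FP type 2. A minimal sketch follows, assuming each peak carries an identifier that links it to the reference group FP; the identifiers and the data layout are hypothetical.

```python
def make_fp_type2(fp_peaks, featured_ids):
    """Return the peaks of one FP that were not converted to feature values.

    fp_peaks     : dict {peak_id: (retention_time, height)} for one reference FP
    featured_ids : set of peak_ids that appear in the peak data feature value file
    """
    remaining = [peak for peak_id, peak in fp_peaks.items()
                 if peak_id not in featured_ids]
    # Output retention time and peak data, ordered by retention time.
    return sorted(remaining)
```

For example, an FP with peaks a, b, and c, of which a and c were selected into the reference group FP, yields an FP type 2 containing only peak b.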
S10008: feature value processing of reference FP _ type 2 by region division
Fig. 115 is a flowchart showing details of "feature value processing of reference FP _ type 2 by region division" in step S10008 of fig. 110.
In step S10401, a process of "setting the region division conditions of the FP space" is executed. In this processing, in order to divide the region of the reference FP type 2, a plurality of positions are set for the 1st vertical dividing line and for the 1st horizontal dividing line. With this setting, as shown in figs. 76 and 77, for example, a plurality of 1st vertical and horizontal dividing lines 141 and 143 are set for the FP space. Once the plurality of 1st vertical and horizontal dividing lines 141 and 143 have been set, the process proceeds to step S10402.
In step S10402, a process of "setting the region division patterns of the FP space" is performed. In this process, the positions of the 2nd and subsequent dividing lines are set for every combination of the 1st vertical and horizontal dividing lines, and division patterns (m × n patterns) are created. By this setting, as shown in fig. 78, a plurality of patterns of divided regions formed by the vertical and horizontal dividing lines 141 and 143 are set for the FP space. Once the division patterns have been set, the process proceeds to step S10403.
In step S10403, a process of "sequentially reading the file of the reference FP _ type 2" is performed. Through this processing, the file of the reference FP type 2 is read, and the process proceeds to step S10404.
In step S10404, a process of "calculating total peak data of the entire FP space" is executed. In this processing, for example, the total height of all peaks existing in each of the divided lattices 145 shown in fig. 79 is calculated (fig. 81), and the process proceeds to step S10405.
In step S10405, a process of "sequentially dividing the FP space by each division pattern" is performed. In this process, the FP space is sequentially divided by the plurality of region division patterns set in step S10402, and the process proceeds to step S10406.
In step S10406, a process of "calculating the existence ratio of peak data in the divided regions" is executed. In this processing, for example, the total height of all peaks present in each of the divided cells 145 shown in fig. 79 is calculated (fig. 81), and the peak existence ratio in each cell 145 of fig. 79 is calculated as: feature value = total peak height in the region / total of all peak heights. The calculation results are shown in figs. 83 to 85, for example. When the calculation is completed, the process proceeds to step S10408.
In step S10408, the judgment process "is division completed with all division patterns?" is executed. In this process, it is determined whether or not the feature value processing is completed for all of the region division patterns set in step S10402. If the feature value processing is completed (yes), the process proceeds to step S10409; if it is not completed (no), the process returns to step S10405. Steps S10405 to S10408 are repeated until the feature value processing of all the region division patterns is completed.
In step S10409, the judgment process "is the processing completed for all reference FP_type 2?" is performed. In this processing, it is determined whether or not the feature value processing is completed for all of the reference FP type 2 created for each of the plurality of reference FPs. If the processing is completed for all the reference FP type 2 (yes), step S10008 is completed; if it is not completed (no), the process returns to step S10403. Steps S10403 to S10409 are repeated until the feature value processing of all the reference FP type 2 is completed.
Fig. 128 shows a reference type 2 group FP example 207.
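The m × n division patterns of steps S10401 and S10402 can be illustrated as below. The sketch assumes that each pattern is obtained by translating a base set of dividing lines in parallel, so that the 2nd and subsequent lines keep their spacing relative to the 1st line; the actual rule for placing the subsequent lines is given in the figures and may differ. All names are hypothetical.

```python
from itertools import product


def make_division_patterns(v_base, h_base, v_offsets, h_offsets):
    """Return one (vertical_lines, horizontal_lines) pair per division pattern.

    v_base, h_base       : base positions of the vertical / horizontal dividing lines
    v_offsets, h_offsets : m and n candidate shifts of the 1st vertical / horizontal line
    """
    patterns = []
    for dv, dh in product(v_offsets, h_offsets):
        # Parallel shift of every dividing line by the same offset (assumption).
        patterns.append(([x + dv for x in v_base], [y + dh for y in h_base]))
    return patterns
```

With 10 offsets in each direction this yields 100 patterns; each reference FP type 2 is then evaluated once per pattern.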
S10009: reference data creation process
Fig. 116 is a flowchart showing details of "creation processing of reference data" in step S10009 of fig. 110.
In step S10501, a process of "reading the region division feature value file" is performed. Through this processing, the reference FP region division feature value file (see reference type 2 group FP example 207 shown in fig. 128) is read, and the process proceeds to step S10502.
In step S10502, a process of "calculating the number of divided patterns when the region is divided" is performed. By this processing, the number of divided patterns obtained by dividing the region is calculated. The number of divided patterns is calculated as 100, for example, as described in fig. 70 to 80. After this calculation, the process proceeds to step S10503.
In step S10503, the process of "reading reference group FP" is executed, and the process proceeds to step S10504.
In step S10504, "create a file (reference group FP2) in which a plurality of divided patterns are copied to each line of the reference group FP" is performed. In this process, the reference group FP and the area division feature value file are merged, and the column of the reference group FP is copied in accordance with the number of division patterns to create a reference group FP 2. For example, the profile example 197 of the reference group FP in fig. 123 is copied so as to correspond to the peak data characteristic value (reference group FP2) of the reference group merged data example 209 in fig. 129. After the copying, the process proceeds to step S10505.
In step S10505, a process of "merging the reference group FP2 and the region division feature value file for each line" is performed. In this processing, the data of the reference group FP2 copied in step S10504 and the data of the area division feature value file are merged for each line, and the processing proceeds to step S10506.
In step S10506, a process of "outputting the merged data" is performed. In this processing, a reference FP feature value merged file according to the merging result is output (see reference group merged data example 209 in fig. 129).
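Steps S10504 and S10505 amount to repeating each row of the reference group FP once per division pattern and then joining it with the matching row of region division feature values. The sketch below assumes that the region division rows are ordered by reference FP first and by division pattern second; this ordering, the list-of-lists layout, and the function name are assumptions.

```python
def build_reference_merged(group_fp_rows, region_rows, n_patterns):
    """Merge peak feature values and region division feature values row by row.

    group_fp_rows : one list of peak feature values per reference FP
    region_rows   : one list of region feature values per (reference FP, pattern)
    n_patterns    : number of division patterns
    """
    if len(region_rows) != len(group_fp_rows) * n_patterns:
        raise ValueError("region rows must cover every FP and every pattern")
    # Reference group FP2: each row copied once per division pattern.
    group_fp2 = [row for row in group_fp_rows for _ in range(n_patterns)]
    return [peak + region for peak, region in zip(group_fp2, region_rows)]
```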
Effect of example 1
The method for evaluating a multicomponent material according to embodiment 1 of the present invention includes the FP production process 148, the target FP peak assignment process 149, the target FP peak feature value production process 151, the target FP type 2 production process 153, the target FP region division feature value production process 155, the target FP feature value merging process 157, the reference FP peak assignment process 159, the reference FP assignment result merging process 161, the reference FP peak feature value production process 163, the reference FP type 2 production process 165, the reference FP region division feature value production process 167, the reference FP feature value merging process 169, and the evaluation process 171.
The FP production process 148 includes the target FP production process 173 and a reference FP production process 175.
The target FP peak assignment process 149 includes the reference FP selection process 177, the peak pattern creation process 179, and the peak assignment process 181.
By processing the 3D chromatogram 41 of the multi-component drug to be evaluated in these 7 processes (148, 149, 151, 153, 155, 157, 171), the accuracy and efficiency of quality evaluation of the drug to be evaluated can be further improved.
In particular, the target FP peak feature values are created on the basis of the target FP 43 and a plurality of reference FPs; the target FP type 2 is created from the peaks of the target FP 43 that remain after the peaks converted to feature values are removed; the target FP type 2 is divided into a plurality of regions, and the target FP region division feature values are created from the existence ratio of the peaks in each region; and the target FP peak feature values and the target FP region division feature values are merged to create the target FP integrated feature values. The target FP integrated feature values are then compared and evaluated against the reference FP integrated feature values, which are created in a corresponding manner from a plurality of reference FPs of the multi-component substance serving as the evaluation reference. Therefore, peaks of the target FP that are not included in the target FP peak feature values can also be included in the evaluation, and the accuracy of quality evaluation of the evaluation target drug can be reliably improved.
The target FP 43 created by the target FP creation process 173 is composed of three-dimensional information (peak, retention time, and UV spectrum) in the same manner as the 3D chromatogram 41. Therefore, the data inherits the information specific to the drug as it is. Nevertheless, since the data volume is compressed to about 1/70, the amount of information to be processed is greatly reduced compared with the 3D chromatogram 41, and the processing speed can be increased.
The target FP creation process 173 creates an FP in which a plurality of FPs with different detection wavelengths are synthesized. Thus, even for a multicomponent drug containing a combination of components that cannot all be detected at one wavelength, quality evaluation covering all components can be performed by synthesizing FPs of a plurality of detection wavelengths.
The target FP creation process 173 creates an FP including all peaks detected from the 3D chromatogram. Therefore, it is suitable for quality evaluation of a Chinese medicinal preparation containing multiple components.
In the reference FP selection process 177, the retention time appearance patterns of the FPs are compared, and a reference FP whose pattern agrees well with that of the target FP is selected as the reference FP suitable for attribution of the target FP. Thus, in the peak assignment process 181, assignment processing is performed between FPs having similar patterns, so assignment with high accuracy is possible.
In the peak pattern creating step 179, peak patterns are created comprehensively for each attribution target peak and each attribution candidate peak, using a plurality of neighboring peaks. Thus, even if the overall patterns of the target FP and the reference FP differ slightly from each other, the peak assignment process 181 can assign the peaks with high accuracy.
In the peak assignment step 181, the peak to be assigned is specified by adding the degree of coincidence of the UV spectra of the attribution target peak and the attribution candidate peak to the degree of coincidence of the peak patterns created in the peak pattern creation step 179. Therefore, highly accurate attribution can be performed.
In the peak attribution process 181, all peaks of the object FP are simultaneously attributed to the peaks of the reference FP. Therefore, efficient attribution processing can be performed.
In the evaluation process 171, FP formed of multi-components belonging to multidimensional data are collected into one dimension as MD values by the MT method, and a plurality of evaluation target batches are simply compared and evaluated. Therefore, it is suitable for evaluating a multicomponent drug composed of a plurality of components.
The target FP region division feature value creation process 155 divides the region by a plurality of vertical division lines 141 parallel to the signal intensity axis and a plurality of horizontal division lines 143 parallel to the time axis.
Therefore, the region division can be simplified, and the processing speed can be increased.
The plurality of horizontal dividing lines 143 are set at regular intervals in the direction of increasing signal intensity.
Therefore, the region can be subdivided in the portion having a high peak density, and the peak existence rate by the region division can be efficiently calculated.
The multi-component substance evaluation method further includes the reference FP creation process 175, the reference FP peak assignment process 159, the reference FP assignment result merging process 161, the reference FP peak feature value creation process 163, the reference FP type 2 creation process 165, the reference FP region division feature value creation process 167, and the reference FP feature value merging process 169.
Therefore, a reference FP integrated feature value obtained by integrating the reference FP peak feature value and the reference FP region division feature value can be compared with the target FP integrated feature value in the evaluation process 171, and the accuracy and efficiency of quality evaluation of the evaluation target medicine can be further improved.
In the reference FP region dividing feature value creating process 167, the position of each region may be changed, and the reference FP region dividing feature value may be created before and after the change.
Therefore, even if the holding time or the peak height fluctuates due to slight variation in analysis conditions or the like, and the value in each cell 145 fluctuates greatly in a single pattern, the amount of the peak existing in each cell 145 can be extracted regardless of the fluctuation, and the accuracy and efficiency of the quality evaluation of the drug to be evaluated can be further improved.
In the reference FP region division feature value creation process 167, the region is divided by a plurality of vertical division lines 141 parallel to the signal intensity axis and a plurality of horizontal division lines 143 parallel to the time axis.
Therefore, the region division can be simplified, and the processing speed can be increased.
The plurality of horizontal dividing lines 143 are set at regular intervals in the direction of increasing signal intensity.
Therefore, the region can be subdivided in the portion having a high peak density, and the peak existence rate by the region division can be efficiently calculated.
In the reference FP region division characteristic value creating process 167, the positions of the regions 145 are changed by changing and setting the positions so that the vertical and horizontal dividing lines 141 and 143 move in parallel within the setting range.
Therefore, the position of each region 145 can be changed efficiently by a simple process.
The evaluation program of the multi-component medicament of the embodiment of the invention realizes each function on a computer, and can further improve the accuracy and efficiency of evaluation.
The multi-component medicament evaluation device of the embodiment of the invention enables the parts 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25 and 27 to function, and further improves the evaluation accuracy and efficiency.
Variation of the calculation of the degree of coincidence of peak patterns (P_Sim)
The calculation of the degree of coincidence (P_Sim) of the peak patterns in figs. 63, 64, and 105 applies to the case where the FP is made with peak heights, as in the above-described embodiment, and is based on the difference between the peak heights being compared.
On the other hand, the peak in the method, program, and apparatus for generating a feature value of a pattern or FP according to the present invention may mean either the maximum value of the signal intensity (peak height), as described above, or the area value of the signal intensity (peak area) expressed as a height.
In the latter case, the FP is made with peak areas; since the FP expresses each area value as a height, it takes the same form as the FP made with peak heights in the above-described embodiment. Therefore, the same evaluation can be performed by applying the same processing as in the above embodiment, in which the FP is made with peak heights.
However, when the FP is made with peak areas, the differences between the compared peak values become large, so calculation based on ratios is preferable for ease of processing.
Hereinafter, the degree of coincidence (P_Sim) of the peak pattern calculated from ratios will be described, taking the cases of n = 2 and n = 4 as examples.
When n = 2:
P_Sim = (p1/p2)#1 × (|r1 − (r2 + d)| + 1)
      + (dn1/fn1)#1 × (|(cn1 − r1) − (en1 − r2)| + 1)
      + (dn2/fn2)#1 × (|(cn2 − r1) − (en2 − r2)| + 1)
When n = 4:
P_Sim = (p1/p2)#1 × (|r1 − (r2 + d)| + 1)
      + (dn1/fn1)#1 × (|(cn1 − r1) − (en1 − r2)| + 1)
      + (dn2/fn2)#1 × (|(cn2 − r1) − (en2 − r2)| + 1)
      + (dn3/fn3)#1 × (|(cn3 − r1) − (en3 − r2)| + 1)
      + (dn4/fn4)#1 × (|(cn4 − r1) − (en4 − r2)| + 1)
Here, #1 indicates that the ratio of the two values being compared is taken as (larger value / smaller value).
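The ratio-based formula above can be written as a short function. The sketch assumes, from the structure of the formula only, that p1 and r1 are the value and retention time of the attribution target peak, p2 and r2 those of the candidate peak, d a retention time correction, (dn_i, cn_i) the value and retention time of the i-th neighboring peak on the target side, and (fn_i, en_i) those on the candidate side; these readings and all function names are assumptions.

```python
def ratio(a, b):
    """Ratio of two compared values taken as larger value / smaller value."""
    small, large = sorted((a, b))
    if small <= 0:
        raise ValueError("peak values must be positive")
    return large / small


def p_sim_ratio(p1, r1, p2, r2, d, target_neighbors, candidate_neighbors):
    """Ratio-based P_Sim for n neighboring peaks (n = number of neighbor pairs).

    target_neighbors    : list of (dn_i, cn_i) pairs
    candidate_neighbors : list of (fn_i, en_i) pairs, same length and order
    """
    score = ratio(p1, p2) * (abs(r1 - (r2 + d)) + 1)
    for (dn, cn), (fn, en) in zip(target_neighbors, candidate_neighbors):
        score += ratio(dn, fn) * (abs((cn - r1) - (en - r2)) + 1)
    return score
```

Under this reading, two identical peak patterns give the minimum value n + 1, because every ratio and every bracketed term equals 1; larger values indicate poorer agreement.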
Note that the degree of coincidence of the peak pattern (P_Sim) can also be calculated from ratios when the FP is made with peak heights, and from differences when the FP is made with peak areas.
Variation of subroutine 2
Fig. 130 is a flowchart showing a modification of subroutine 2, applied in place of subroutine 2 of fig. 104, and shows details of the modified subroutine 2 of the target FP assigning process 2 in fig. 99. The degree of coincidence of the UV spectra is calculated by the processing of this modification.
In this modification of subroutine 2, slope information (DNS) of the moving average of the UV pattern can be added to the RMSD of subroutine 2 in fig. 104. The DNS is expressed by the following equation and is defined as the number of disagreements in the sign (+/−) of the moving slope of the moving average of the UV pattern when two patterns are compared. That is, DNS is a value for evaluating the coincidence of the positions of the maximum and minimum values of the UV pattern.
By adding this DNS information to the RMSD, the degree of coincidence of the waveforms in the UV spectrum can be calculated more accurately.
Subroutine 2 of the modification in fig. 130 is substantially the same as subroutine 2 of fig. 104 in steps S2001 to S2008. However, in step S2001, the initial settings section 1 ← w1 and section 2 ← w2 are added, so that the sections used to calculate the moving average and the moving slope described later are available.
In subroutine 2 of this modification, steps S2010 to S2013 are added in order to add the DNS, and the degree of coincidence including the DNS can be calculated in step S2009A.
In step S2010, the judgment process "add DNS?" is performed. When it is determined that the DNS is to be added (yes), the process proceeds to step S2011; when it is determined that the DNS is not to be added (no), the process proceeds to step S2009A. Whether or not the DNS is added depends on, for example, the initial settings. For example, the DNS is added when the FP is made with peak areas, and is not added when the FP is made with peak heights.
However, even in the case of the above embodiment in which the FP is made by the peak height, the UV pattern matching degree can be calculated by the process of adding the DNS, and in the case of the FP made by the peak area, the UV pattern matching degree can be calculated by the process of the above embodiment in which the DNS is not added.
In step S2011, the process of "calculating the moving averages of x and y in section 1 (w1)" is executed to determine the moving averages over section 1 (w1). Section 1 (w1) is a section of wavelengths of the UV data. If w1 is 3 in the initial setting of step S2001, the section is section 1 (3), and the average of the UV intensities at 3 wavelengths is obtained. Details are described later with reference to the table of Fig. 131.
In step S2012, the process of "calculating the moving inclinations of x and y in section 2 (w2)" is executed to determine the moving inclinations over section 2 (w2). Section 2 (w2) is a section of the moving averages obtained in step S2011. If w2 is 3 in the initial setting of step S2001, the section is section 2 (3), and the inclination (+/-) over 3 moving averages is obtained from the moving averages calculated in step S2011. Details are described later with reference to the table of Fig. 131.
In step S2013, the process of "calculating the number of disagreements (DNS) in the signs of the moving inclinations of x and y" is executed, and the number of disagreements in the sign (+/-) of the inclination is calculated from the moving inclinations calculated in step S2012. In Fig. 66, a positive (+) moving inclination represents a slope rising to the right, and a negative (-) moving inclination represents a slope falling to the right.
When the process proceeds from step S2013 to step S2009A, step S2009A calculates the degree of matching with DNS added.
In step S2009A, the process of "calculating the degree of coincidence of the UV spectra of x and y (UV_Sim)" is executed. In the calculation with DNS added, UV_Sim is calculated from the sum z of the squared distances between the UV spectra and the number a of data points of x as UV_Sim = √(z/a) × 1.1^DNS. UV_Sim is then passed to step S306 of Fig. 81, and the process of calculating the degree of coincidence of the UV spectra is completed.
The processing from step S2010 to step S2009A is the same as that in step S2009 in fig. 86.
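The calculation in step S2009A can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: it reads the formula as the RMSD of the two UV spectra, √(z/a), multiplied by a factor of 1.1 for each sign disagreement counted in DNS. The function name `uv_sim` and the `penalty` parameter are hypothetical.

```python
import math


def uv_sim(x, y, dns=0, penalty=1.1):
    """Degree of coincidence of two UV spectra (smaller means more similar).

    x, y:    UV intensities sampled at the same wavelengths
    dns:     number of sign disagreements of the moving inclinations
    penalty: assumed multiplicative factor per disagreement (1.1 here)
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")

    z = sum((xi - yi) ** 2 for xi, yi in zip(x, y))  # sum of squared distances
    a = len(x)                                       # number of data points of x
    rmsd = math.sqrt(z / a)
    return rmsd * penalty ** dns                     # dns=0 leaves the RMSD unchanged
```

With `dns=0` the function reduces to the plain RMSD, which corresponds to the branch in which DNS is not added.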
Fig. 131 is a table showing a calculation example of the moving average and the moving inclination.
In Fig. 131, the upper part is an example of UV data, the middle part is a calculation example of the moving average, and the lower part is a calculation example of the moving inclination. In the UV data, the UV intensities are represented by a1 to a7 instead of specific values. For example, the UV intensity at 220 nm is a1, the UV intensity at 221 nm is a2, and so on. The calculation examples of the moving average and the moving inclination also use the UV intensities a1 to a7 instead of specific values.
The moving average takes section 1 (w1 = 3) as an example: in step S2011 (Fig. 130), m1, m2, ... are calculated as the averages over the sections (a1, a2, a3), (a2, a3, a4), .... The moving inclination likewise takes section 2 (w2 = 3) as an example: in step S2012 (Fig. 130), s1, s2, ... are calculated from the sections (m1, m2, m3), (m2, m3, m4), .... For example, the difference m3 - m1 is taken as the moving inclination, and its sign (+/-) is extracted.
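The calculation of Fig. 131 can be sketched in a few lines. This is a minimal sketch, assuming that the moving inclination over a section is the last moving average minus the first (m3 - m1 for w2 = 3), and that a zero difference is treated as its own sign. The text does not state how a zero difference is handled, so that choice and all function names are hypothetical.

```python
def moving_average(values, w1=3):
    """Averages over every run of w1 consecutive UV intensities (m1, m2, ...)."""
    return [sum(values[i:i + w1]) / w1 for i in range(len(values) - w1 + 1)]


def moving_inclination_signs(averages, w2=3):
    """Sign (+1, 0, -1) of last minus first moving average in each section of w2."""
    signs = []
    for i in range(len(averages) - w2 + 1):
        diff = averages[i + w2 - 1] - averages[i]  # e.g. m3 - m1 when w2 = 3
        signs.append((diff > 0) - (diff < 0))
    return signs


def count_sign_disagreements(x, y, w1=3, w2=3):
    """DNS: number of positions where the inclination signs of x and y differ."""
    sx = moving_inclination_signs(moving_average(x, w1), w2)
    sy = moving_inclination_signs(moving_average(y, w1), w2)
    return sum(1 for a, b in zip(sx, sy) if a != b)


# Seven intensities (a1 to a7) give five moving averages and three inclinations.
rising = [1, 2, 3, 4, 5, 6, 7]
falling = [7, 6, 5, 4, 3, 2, 1]
dns = count_sign_disagreements(rising, falling)  # all three signs differ
```

Two spectra whose maxima and minima fall at the same wavelengths produce the same sign sequence and therefore a DNS of 0.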
In this way, when an FP is created from peak areas, the UV pattern matching degree can be calculated with DNS added in the assignment process for the reference group FP and in the merging process for the reference FP assignment results. With this calculation, even if the distance (dis) between two corresponding points shown in Fig. 66 is larger than for an FP created from peak heights, the case is easily handled and the UV pattern matching degree can be calculated accurately.
Others
In the method, program, and apparatus for creating feature values of a pattern or an FP according to the embodiments of the present invention, when the FP is created from peak areas, the signal intensity axis can be read as the area value axis and the signal intensity as the area value.
The embodiments of the present invention are suited to evaluating Chinese medicines as multi-component drugs, and can also be applied to evaluating other multi-component substances.
In this embodiment, a pattern region division feature value may be created for the target FP type 2 or the reference FP type 2, or may be created for the target FP or the reference FP.
In addition, the present invention can be applied widely, provided that it includes a pattern region division feature value creation process that divides a pattern in which peaks change in time series into a plurality of regions and creates pattern region division feature values from the presence ratio or the presence amount of the peaks present in each region.
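The region division described above can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not the patented implementation: each peak is assigned to the single cell that contains its apex (retention time, height), the presence amount of a cell is the sum of the heights of the peaks assigned to it, and the presence ratio is that amount divided by the total height of all peaks. The function names and the half-open cell boundaries are hypothetical choices.

```python
from bisect import bisect_right


def cell_index(edges, value):
    """Index i of the half-open interval [edges[i], edges[i+1]) holding value, else None."""
    i = bisect_right(edges, value) - 1
    return i if 0 <= i < len(edges) - 1 else None


def region_division_features(peaks, time_edges, height_edges):
    """Presence ratio of peaks in each cell of the grid.

    peaks:        list of (retention_time, height) pairs
    time_edges:   ascending positions of the vertical dividing lines
    height_edges: ascending positions of the horizontal dividing lines
    Returns a list of rows (one per height band), each a list of ratios.
    """
    n_rows, n_cols = len(height_edges) - 1, len(time_edges) - 1
    amounts = [[0.0] * n_cols for _ in range(n_rows)]
    for rt, height in peaks:
        col = cell_index(time_edges, rt)
        row = cell_index(height_edges, height)
        if row is not None and col is not None:
            amounts[row][col] += height  # presence amount (assumed: summed height)
    total = sum(height for _, height in peaks)
    return [[a / total if total else 0.0 for a in row] for row in amounts]


def shifted(edges, offset):
    """Translate every dividing line by the same offset."""
    return [e + offset for e in edges]


peaks = [(1.0, 10.0), (2.5, 30.0), (3.5, 60.0)]
before = region_division_features(peaks, [0, 2, 4], [0, 50, 100])
after = region_division_features(peaks, shifted([0, 2, 4], 1.0), [0, 50, 100])
```

Computing the features both before and after the dividing lines are moved, as in `before` and `after`, keeps a peak that sits close to a dividing line from changing the feature values abruptly when its retention time drifts slightly.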
The FP of the above example is based on all peaks in the 3D chromatogram, but the FP may also be created after removing trivial data, for example, peaks having a peak area of less than 5% in the 3D chromatogram.
The FP of the above example was created from peak heights, and the evaluations of Figs. 87 to 91 were obtained. When the FP is created from peak areas, the MD value is obtained by the MT method in the same procedure as in the above example using peak heights, and evaluations are obtained in the same manner as in Figs. 87 to 91.
The chromatogram is not limited to a 3D chromatogram, and an FP consisting of peaks and their retention times, with the UV spectra removed, may be used. In this case, the processing can be performed in the same manner as in the above example, except that the degree of coincidence of the UV spectra is omitted.
Description of the symbols
1 Evaluation device for multi-component drugs (pattern feature value creation device, FP feature value creation device)
3 FP creation unit
5 Target FP peak attribution unit
7 Target FP peak feature value creation unit
9 Target FP type 2 creation unit
11 Target FP region division feature value creation unit
13 Target FP feature value merging unit
15 Reference FP peak attribution unit
17 Reference FP attribution result merging unit
19 Reference FP peak feature value creation unit
21 Reference FP type 2 creation unit
23 Reference FP region division feature value creation unit
25 Reference FP feature value merging unit
27 Evaluation unit
29 Target FP creation unit
31 Reference FP creation unit
33 Reference FP selection unit
35 Peak pattern creation unit
37 Peak attribution unit
39 Chinese medicine
41 3D chromatogram
42 UV spectra of the peaks contained in the target FP
43 Target FP
45 Reference group FP
47 Target FP attributed to the reference group FP
49 Target FP type 2
51 Target FP region division feature value
53 Evaluation result of the target FP
55 FP of drug A
57 FP of drug B
59 FP of drug C
61 Target FP (retention time 10.0-14.5 minutes)
63, 65, 67, 69, 71, 73, 75, 77, 79, 81 Peaks in the target FP (retention time 10.0-14.5 minutes)
83 Reference FP (retention time 10.0-14.5 minutes)
85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105 Peaks in the reference FP (retention time 10.0-14.5 minutes)
107 Retention time appearance pattern of the target FP
109 Retention time appearance pattern of the reference FP
111 Number of matches in retention time appearance distances
113 Degree of matching of retention time appearance patterns
115 Peak pattern of the attribution target peak of the target FP (3 peaks)
117, 119, 121, 123 Peak patterns of the attribution candidate peaks of the reference FP (3 peaks)
125 Peak pattern of the attribution target peak of the target FP (5 peaks)
127, 129, 131, 133 Peak patterns of the attribution candidate peaks of the reference FP (5 peaks)
135 UV spectrum of the attribution target peak
139 UV spectrum of the attribution candidate peak
141 Vertical region dividing line
143 Horizontal region dividing line
145 Region (cell) divided by the vertical and horizontal region dividing lines
147 Peak heights in each region
148 FP creation step
149 Target FP peak attribution step
151 Target FP peak feature value creation step
153 Target FP type 2 creation step
155 Target FP region division feature value creation step (pattern region division feature value creation step, FP region division feature value creation step)
157 Target FP feature value merging step
159 Reference FP peak attribution step
161 Reference FP attribution result merging step
163 Reference FP peak feature value creation step
165 Reference FP type 2 creation step
167 Reference FP region division feature value creation step (pattern region division feature value creation step, FP region division feature value creation step)
169 Reference FP feature value merging step
171 Evaluation step
173 Target FP creation step
175 Reference FP creation step
177 Reference FP selection step
179 Peak pattern creation step
181 Peak attribution step
183 3D chromatogram data example
185 Peak information data example
187 FP data example
189 Determination result file example
191 Attribution candidate peak score table example
193 Attribution candidate peak number example
195 Comparison result file example
197 Reference group FP data example
199 Target FP peak feature value file example
201 FP type 2 data example
203 Target FP region division feature value file example
205 Target FP merged feature value file example
207 Reference type 2 group FP data example
209 Reference group merged data example
Claims (14)
1. An FP feature value creation method, characterized in that
the method includes an FP region division feature value creation step of dividing an FP into a plurality of regions and creating FP region division feature values from the presence ratio or the presence amount of the peaks present in each region,
the FP is composed of peaks detected from a chromatogram of a multi-component substance and their retention times, and
in the FP region division feature value creation step, the positions of the respective regions are changed, and the FP region division feature values are created before and after the change.
2. The FP feature value creation method according to claim 1, wherein the multi-component substance is a multi-component drug.
3. The FP feature value creation method according to claim 2, wherein the multi-component drug is a Chinese medicine.
4. The FP feature value creation method according to claim 2, wherein the multi-component drug is any one of a crude drug, a combination of crude drugs, and an extract thereof.
5. The FP feature value creation method according to any one of claims 1 to 4, wherein in the FP region division feature value creation step, the regions are divided by a plurality of vertical dividing lines parallel to a signal intensity axis or an area value axis and a plurality of horizontal dividing lines parallel to a time axis.
6. The FP feature value creation method according to claim 5, wherein the plurality of horizontal dividing lines are set at regular intervals in the direction in which the signal intensity or the area value increases.
7. The FP feature value creation method according to claim 5, wherein in the FP region division feature value creation step, the positions of the respective regions are changed by setting the positions such that each of the vertical and horizontal dividing lines is translated within a set range.
8. An FP feature value creation device, characterized in that
the device includes an FP region division feature value creation unit that divides an FP into a plurality of regions and creates FP region division feature values from the presence ratio or the presence amount of the peaks present in each region,
the FP is composed of peaks detected from a chromatogram of a multi-component substance and their retention times, and
the FP region division feature value creation unit changes the positions of the respective regions and creates the FP region division feature values before and after the change.
9. The FP feature value creation device according to claim 8, wherein the multi-component substance is a multi-component drug.
10. The FP feature value creation device according to claim 9, wherein the multi-component drug is a Chinese medicine.
11. The FP feature value creation device according to claim 9, wherein the multi-component drug is any one of a crude drug, a combination of crude drugs, and an extract thereof.
12. The FP feature value creation device according to any one of claims 8 to 11, wherein the FP region division feature value creation unit divides the regions by a plurality of vertical dividing lines parallel to a signal intensity axis or an area value axis and a plurality of horizontal dividing lines parallel to a time axis.
13. The FP feature value creation device according to claim 12, wherein the plurality of horizontal dividing lines are set at regular intervals in the direction in which the signal intensity or the area value increases.
14. The FP feature value creation device according to claim 12, wherein the FP region division feature value creation unit changes the positions of the respective regions by setting the positions such that each of the vertical and horizontal dividing lines is translated within a set range.
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2011123849 | 2011-06-01 | | |
| JP2011-123849 | 2011-06-01 | | |
| PCT/JP2012/003618 WO2012164956A1 (en) | 2011-06-01 | 2012-05-31 | Creation method, creation program, and creation device for characteristic amount of pattern or fp |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1181119A1 (en) | 2013-11-01 |
| HK1181119B (en) | 2017-06-30 |